Using Deep Learning to Extract Scenery Information in Real Time Spatiotemporal Compressed Sensing

One problem with real-time compressed sensing systems is the computational cost of the reconstruction algorithms. This cost is especially problematic for closed-loop sensory applications, where the sensory parameters need to be adjusted constantly to adapt to a dynamic scene. Through a preliminary experiment with the MNIST dataset, we showed that some scene information (object identity, scene movement direction, and speed) can be extracted directly from the compressed samples using a deep convolutional neural network. The network achieves 100% accuracy in distinguishing moving velocity, 96.22% in recognizing the digit, and 90.04% in detecting moving direction after the coded images are re-centered. Although the classification accuracy drops slightly compared with using the original videos, classification is two times faster than classifying the videos directly. This method also eliminates the need for sparse reconstruction prior to classification.
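
To make the idea of classifying compressed samples concrete, the following is a minimal sketch of how a short video might be collapsed into a single coded image with a pixel-wise coded exposure pattern. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the 28x28 frame size, the 8-frame clip, the 2-frame exposure window, the wrap-around motion, and the square stand-in for a digit are all hypothetical choices.

```python
import numpy as np

def make_moving_video(frame, n_frames, velocity):
    """Shift a 2-D frame by `velocity` = (dy, dx) pixels per frame (wraps at edges)."""
    dy, dx = velocity
    return np.stack(
        [np.roll(frame, (t * dy, t * dx), axis=(0, 1)) for t in range(n_frames)]
    )

def coded_exposure_mask(n_frames, height, width, exposure_len, rng):
    """Assumed pattern: each pixel is exposed for one contiguous window of
    `exposure_len` frames that starts at a random time."""
    starts = rng.integers(0, n_frames - exposure_len + 1, size=(height, width))
    t = np.arange(n_frames)[:, None, None]
    return ((t >= starts) & (t < starts + exposure_len)).astype(np.float32)

def compress(video, mask):
    """Collapse T frames into one coded image by integrating exposed pixels over time."""
    return (video * mask).sum(axis=0)

rng = np.random.default_rng(0)
frame = np.zeros((28, 28), dtype=np.float32)
frame[10:18, 10:18] = 1.0  # hypothetical stand-in for an MNIST digit
video = make_moving_video(frame, n_frames=8, velocity=(0, 1))
mask = coded_exposure_mask(8, 28, 28, exposure_len=2, rng=rng)
coded = compress(video, mask)  # single 28x28 image, the would-be network input
```

In this sketch the coded image has the same spatial size as one frame, so a classifier operating on it sees one image per clip rather than eight frames, and no sparse reconstruction step is run beforehand.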
