Paper information - Comparison between Frame-Constrained Fix-Pixel-Value and Frame-Free Spiking-Dynamic-Pixel ConvNets for Visual Processing

Most scene segmentation and categorization architectures that extract features from images and patches make extensive use of 2D convolution operations for template matching, template search, and denoising. Convolutional Neural Networks (ConvNets) are one example of such architectures and can implement general-purpose bio-inspired vision systems. On standard digital computers, 2D convolutions are usually expensive in terms of resource consumption and severely limit efficient real-time applications. Nevertheless, neurocortex-inspired solutions, such as dedicated Frame-Based or Frame-Free Spiking ConvNet Convolution Processors, are advancing real-time visual processing. The two approaches share the same neural inspiration, but each solves the problem in a different way. Frame-Based ConvNets process video information frame by frame in a very robust and fast way that requires using and sharing the available hardware resources (such as multipliers and adders). Hardware resources are fixed and time-multiplexed by fetching data in and out, so memory bandwidth and size are important for good performance. Spike-based convolution processors, on the other hand, are a frame-free alternative that can convolve a spike-based source of visual information with very low latency, which makes them ideal for very high-speed applications. However, their hardware resources need to be available all the time and cannot be time-multiplexed, so the hardware should be modular, reconfigurable, and expandable. Hardware implementations in both custom VLSI integrated circuits (digital and analog) and FPGAs have already been used to demonstrate the performance of these systems. In this paper we present a comparative study of these two neuro-inspired solutions: a brief description of both systems, followed by a discussion of their differences, pros, and cons.
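The contrast drawn in the abstract can be illustrated with a small software sketch. This is not the hardware described in the paper. All function names (`frame_conv2d`, `event_conv2d`, `events_to_frame`) are hypothetical, and the optional threshold-and-reset firing rule is an assumed, simplified integrate-and-fire behaviour added only for illustration. The frame-based routine visits every pixel of a complete frame, while the event-driven routine does work only when a spike arrives, by adding the kernel to the neighbourhood of that spike.

```python
def frame_conv2d(frame, kernel):
    """Frame-based 2D convolution (zero padding, same-size output).

    Assumes an odd-sized kernel. Every output pixel is computed once the
    whole frame is available.
    """
    h, w = len(frame), len(frame[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    sy, sx = y - (a - ph), x - (b - pw)
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += kernel[a][b] * frame[sy][sx]
            out[y][x] = acc
    return out


def event_conv2d(events, shape, kernel, threshold=None):
    """Event-driven convolution: each input spike (y, x) adds the kernel,
    centred on the spike, to a persistent state array.

    If `threshold` is given (an illustrative assumption), a pixel whose
    state reaches it emits an output event and is reset to zero.
    """
    h, w = shape
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    state = [[0.0] * w for _ in range(h)]
    out_events = []
    for y0, x0 in events:
        for a in range(kh):
            for b in range(kw):
                y, x = y0 + a - ph, x0 + b - pw
                if 0 <= y < h and 0 <= x < w:
                    state[y][x] += kernel[a][b]
                    if threshold is not None and state[y][x] >= threshold:
                        out_events.append((y, x))
                        state[y][x] = 0.0  # reset after firing
    return state, out_events


def events_to_frame(events, shape):
    """Accumulate spikes into a frame of per-pixel event counts."""
    h, w = shape
    frame = [[0] * w for _ in range(h)]
    for y, x in events:
        frame[y][x] += 1
    return frame


# Example input: four spikes on a 4x5 pixel array, asymmetric 3x3 kernel.
events = [(2, 2), (2, 3), (0, 0), (2, 2)]
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
shape = (4, 5)

state, _ = event_conv2d(events, shape, kernel)
frame_result = frame_conv2d(events_to_frame(events, shape), kernel)
print(frame_result == state)
```

Without a threshold, the accumulated event-driven state should match the frame-based convolution of the event-count frame, since both compute the same sums. The difference lies in when the work happens: the event-driven version updates its state as each spike arrives, which is the source of the low latency mentioned above, but that state has to stay resident for the whole computation.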
