Does invariant recognition predict tuning of neurons in sensory cortex?

Tuning properties of simple cells in cortical area V1 can be described in terms of a “universal shape” characterized by parameter values that hold across different species ([12], [33], [22]). This puzzling set of findings calls for a general explanation grounded in an evolutionarily important computational function of the visual cortex. We ask here whether these properties are predicted by the hypothesis that the goal of the ventral stream is to compute, for each image, a “signature” vector that is invariant to geometric transformations, as postulated in [30], with the additional assumption that the mechanism for continuously learning and maintaining invariance consists of the memory storage, via Hebbian synapses, of a sequence of neural images of a few objects undergoing transformations (such as translation, scale changes and rotation).

For V1 simple cells, the simplest version of this hypothesis is the online Oja rule, which implies that the tuning of neurons converges to the eigenvectors of the covariance of their input. Starting with a set of dendritic fields spanning a range of sizes, simulations supported by a direct mathematical analysis show that the solution of the associated “cortical equation” provides a set of Gabor-like wavelets with parameter values that are in broad agreement with the physiology data. We show, however, that the simple version of the Hebbian assumption does not predict all the physiological properties. The same theoretical framework also provides predictions about the tuning of cells in V4 and in the face patch AL [17], which are in qualitative agreement with physiology data.

1 Computational goal and mechanism

The original work of Hubel and Wiesel, as well as subsequent research, left open two questions: (1) what the function of the ventral stream in visual cortex is, and (2) how the properties of its neurons are related to that function. Poggio et al.’s so-called “magic” theory [29, 30], here “M-theory”, proposes
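
The abstract's statement that the online Oja rule drives a neuron's tuning toward an eigenvector of its input covariance can be illustrated with a small toy example. The sketch below is not the authors' simulation: it replaces neural images with synthetic zero-mean 2-D Gaussian inputs whose principal axis is known, and the function names (`oja_update`, `sample_input`, `run_oja`), the learning rate, the step count and the variances are all assumptions chosen for illustration. It uses the standard single-neuron form of the rule, w ← w + η y (x − y w) with y = w · x.

```python
import math
import random


def oja_update(w, x, lr):
    """One online Oja step: w <- w + lr * y * (x - y * w), with y = w . x."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]


def sample_input(rng, angle, sigma_major=2.0, sigma_minor=0.5):
    """Draw a zero-mean 2-D Gaussian sample (toy stand-in for an input
    pattern) whose covariance has its principal axis at `angle` radians."""
    a = rng.gauss(0.0, sigma_major)  # component along the principal axis
    b = rng.gauss(0.0, sigma_minor)  # component along the minor axis
    c, s = math.cos(angle), math.sin(angle)
    return [a * c - b * s, a * s + b * c]


def run_oja(angle=math.pi / 6, steps=20000, lr=0.005, seed=0):
    """Run the online rule from a random unit-norm weight vector."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    norm = math.hypot(*w)
    w = [wi / norm for wi in w]
    for _ in range(steps):
        w = oja_update(w, sample_input(rng, angle), lr)
    return w


w = run_oja()

# Compare the learned weights with the known principal eigenvector.
# The sign of w is arbitrary, so the absolute cosine is used.
principal = [math.cos(math.pi / 6), math.sin(math.pi / 6)]
alignment = abs(w[0] * principal[0] + w[1] * principal[1]) / math.hypot(*w)

print("learned weights:", w)
print("weight norm:", math.hypot(*w))
print("|cosine| with principal eigenvector:", alignment)
```

In this toy setting the weight norm should settle near 1 and the absolute cosine near 1, which is the sense in which the rule extracts the leading eigenvector of the input covariance. The sketch covers only that leading direction in two dimensions and says nothing about the Gabor-like solutions or the range of dendritic field sizes discussed in the abstract.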

[1]  O. L. Z.  Book Review: The Organization of Behaviour: A Neuropsychological Theory, 1950.

[2]  E. Oja.  Simplified neuron model as a principal component analyzer, 1982, Journal of Mathematical Biology.

[3]  J. P. Jones, et al.  An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex, 1987, Journal of Neurophysiology.

[4]  Lawrence D. Jackel, et al.  Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.

[5]  T. Poggio, et al.  A network that learns to recognize three-dimensional objects, 1990, Nature.

[6]  Peter Földiák, et al.  Learning Invariance from Transformation Sequences, 1991, Neural Computation.

[7]  Erkki Oja, et al.  Principal components, minor components, and linear neural networks, 1992, Neural Networks.

[8]  C. Shatz, et al.  Transient period of correlated bursting activity during development of the mammalian retina, 1993, Neuron.

[9]  L. Croner, et al.  Receptive fields of P and M ganglion cells across the primate retina, 1995, Vision Research.

[10]  D. C. Van Essen, et al.  Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey, 1996, Journal of Neurophysiology.

[11]  David J. Field, et al.  Emergence of simple-cell receptive field properties by learning a sparse code for natural images, 1996, Nature.

[12]  R. C. Reid, et al.  Efficient Coding of Natural Scenes in the Lateral Geniculate Nucleus: Experimental Test of a Computational Theory, 1996, The Journal of Neuroscience.

[13]  Terrence J. Sejnowski, et al.  The “independent components” of natural scenes are edge filters, 1997, Vision Research.

[14]  Yoshua Bengio, et al.  Convolutional networks for images, speech, and time series, 1998.

[15]  Erkki Oja, et al.  Independent component analysis by general nonlinear Hebbian-like learning rules, 1998, Signal Process.

[16]  T. Poggio, et al.  Hierarchical models of object recognition in cortex, 1999, Nature Neuroscience.

[17]  M. A. Repucci, et al.  Spatial Structure and Symmetry of Simple-Cell Receptive Fields in Macaque Primary Visual Cortex, 2002.

[18]  Antonio Torralba, et al.  Statistics of natural image categories, 2003, Network.

[19]  S. Nelson, et al.  Homeostatic plasticity in the developing nervous system, 2004, Nature Reviews Neuroscience.

[20]  Kunihiko Fukushima, et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, 1980, Biological Cybernetics.

[21]  Charles F. Stevens.  Preserving properties of object shape by computations in primary visual cortex, 2004, Proceedings of the National Academy of Sciences of the United States of America.

[22]  Martin Rehn, et al.  A network that uses few active neurones to code visual input predicts the diverse shapes of cortical receptive fields, 2007, Journal of Computational Neuroscience.

[23]  David G. Lowe, et al.  University of British Columbia, 1945, Canadian Medical Association Journal.

[24]  Thomas Serre, et al.  Robust Object Recognition with Cortex-Like Mechanisms, 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  W. M. Keck, et al.  Highly Selective Receptive Fields in Mouse Visual Cortex, 2008, The Journal of Neuroscience.

[26]  S. Gerber, et al.  Unsupervised Natural Experience Rapidly Alters Invariant Object Representation in Visual Cortex, 2008.

[27]  Bruno A. Olshausen, et al.  Learning real and complex overcomplete representations from the statistics of natural images, 2009, Optical Engineering + Applications.

[28]  Nicolas Pinto, et al.  How far can you get with a modern face recognition test set using only simple features?, 2009, CVPR.

[29]  Doris Y. Tsao, et al.  Functional Compartmentalization and Viewpoint Generalization Within the Macaque Face-Processing System, 2010, Science.

[30]  Andrew Y. Ng, et al.  Unsupervised learning models of primary cortical receptive fields and receptive field plasticity, 2011, NIPS.

[31]  Michael Robert DeWeese, et al.  A Sparse Coding Model with Synaptically Local Plasticity and Spiking Neurons Can Account for the Diverse Shapes of V1 Simple Cell Receptive Fields, 2011, PLoS Comput. Biol.

[32]  Stéphane Mallat, et al.  Group Invariant Scattering, 2011, ArXiv.

[33]  Joel Z. Leibo, et al.  How can cells in the anterior medial face patch be viewpoint invariant, 2011.

[34]  Gerald Penn, et al.  Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[35]  Joel Z. Leibo, et al.  Learning and disrupting invariance in visual recognition with a temporal association rule, 2011, Front. Comput. Neurosci.

[36]  Geoffrey E. Hinton, et al.  ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[37]  Lorenzo Rosasco, et al.  The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work), 2012.