A feedforward architecture accounts for rapid categorization

Primates are remarkably good at recognizing objects. The performance of their visual system and its robustness to image degradation still surpass those of the best computer vision systems, despite decades of engineering effort. In particular, primates achieve high accuracy in ultra-rapid object categorization and rapid serial visual presentation tasks. Given the number of processing stages involved and typical neural latencies, such rapid visual processing is likely to be mostly feedforward. Here we show that a specific implementation of a class of feedforward theories of object recognition (theories that extend the Hubel and Wiesel simple-to-complex cell hierarchy and account for many anatomical and physiological constraints) can predict both the level and the pattern of performance achieved by humans on a rapid masked animal vs. non-animal categorization task.
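To make the simple-to-complex idea concrete, the following is a minimal sketch of a single feedforward stage, not the implementation evaluated in this work. It assumes that "simple" units apply oriented, Gabor-like filters and that "complex" units take a local maximum over neighboring simple responses. The function names, filter parameters, and pooling size are all hypothetical choices made for illustration.

```python
import math

def gabor(size, theta, wavelength=4.0, sigma=2.0):
    """Build a size x size oriented Gabor-like filter with zero mean (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the filter varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    mean = sum(map(sum, kernel)) / size ** 2
    return [[v - mean for v in row] for row in kernel]

def simple_layer(image, kernel):
    """Assumed 'simple' stage: slide the filter over the image, keep absolute responses."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            s = sum(kernel[a][b] * image[i + a][j + b]
                    for a in range(k) for b in range(k))
            row.append(abs(s))
        out.append(row)
    return out

def complex_layer(resp, pool=2):
    """Assumed 'complex' stage: max over non-overlapping pool x pool neighborhoods."""
    h, w = len(resp), len(resp[0])
    return [[max(resp[i + a][j + b]
                 for a in range(pool) for b in range(pool)
                 if i + a < h and j + b < w)
             for j in range(0, w, pool)]
            for i in range(0, h, pool)]

def feedforward(image, n_orient=4, ksize=5, pool=2):
    """One simple-then-complex pass, producing one pooled map per orientation."""
    maps = []
    for n in range(n_orient):
        theta = n * math.pi / n_orient
        maps.append(complex_layer(simple_layer(image, gabor(ksize, theta)), pool))
    return maps

if __name__ == "__main__":
    # Toy 10 x 10 image containing a single vertical edge.
    img = [[0.0] * 5 + [1.0] * 5 for _ in range(10)]
    for n, m in enumerate(feedforward(img)):
        print(n, round(max(max(r) for r in m), 3))
```

In this toy setup the map whose filter varies horizontally responds most strongly to the vertical edge, and the max pooling makes that response tolerant to small shifts in the edge's position. A hierarchy of the kind described above would stack several such stages; that stacking is not shown here.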


[30]  R. Desimone,et al.  Competitive Mechanisms Subserve Attention in Macaque Areas V2 and V4 , 1999, The Journal of Neuroscience.

[31]  T. Poggio,et al.  Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.

[32]  Victor A. F. Lamme,et al.  The implementation of visual routines , 2000, Vision Research.

[33]  V. Lamme,et al.  The distinct modes of vision offered by feedforward and recurrent processing , 2000, Trends in Neurosciences.

[34]  J. Enns,et al.  What’s new in visual masking? , 2000, Trends in Cognitive Sciences.

[35]  C. Connor,et al.  Shape representation in area V4: position-specific tuning for boundary conformation. , 2001, Journal of neurophysiology.

[36]  D. Wilkin,et al.  Neuron , 2001, Brain Research.

[37]  S. Thorpe,et al.  Seeking Categories in the Brain , 2001, Science.

[38]  David J. Freedman,et al.  Categorical representation of visual stimuli in the primate prefrontal cortex. , 2001, Science.

[39]  S. Hochstein,et al.  View from the Top Hierarchies and Reverse Hierarchies in the Visual System , 2002, Neuron.

[40]  Koby Crammer,et al.  Advances in Neural Information Processing Systems 14 , 2002 .

[41]  Heinrich H. Bülthoff,et al.  Biologically motivated computer vision : Second International Workshop, BMCV 2002, Tübingen, Germany, November 22-24, 2002 : proceedings , 2002 .

[42]  Antonio Torralba,et al.  Depth Estimation from Image Structure , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[43]  Michel Vidal-Naquet,et al.  Visual features of intermediate complexity and their use in classification , 2002, Nature Neuroscience.

[44]  T. Gawne,et al.  Responses of primate visual cortical V4 neurons to simultaneously presented stimuli. , 2002, Journal of neurophysiology.

[45]  Seong-Whan Lee,et al.  Biologically Motivated Computer Vision , 2002, Lecture Notes in Computer Science.

[46]  H. Spekreijse,et al.  Masking Interrupts Figure-Ground Signals in V1 , 2002, Journal of Cognitive Neuroscience.

[47]  G. Rousselet,et al.  Is it an animal? Is it a human face? Fast processing in upright and inverted natural scenes. , 2003, Journal of vision.

[48]  Tai Sing Lee,et al.  Hierarchical Bayesian inference in the visual cortex. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.

[49]  Y. Amit,et al.  An integrated network for invariant visual detection and recognition , 2003, Vision Research.

[50]  C. Koch,et al.  Visual Selective Behavior Can Be Triggered by a Feed-Forward Process , 2003, Journal of Cognitive Neuroscience.

[51]  Antonio Torralba,et al.  Statistics of natural image categories , 2003, Network.

[52]  Heiko Wersing,et al.  Learning Optimized Features for Hierarchical Models of Invariant Object Recognition , 2003, Neural Computation.

[53]  Tomaso Poggio,et al.  Intracellular measurements of spatial integration and the MAX operation in complex cells of the cat primary visual cortex. , 2004, Journal of neurophysiology.

[54]  Thomas Serre,et al.  Realistic Modeling of Simple and Complex Cell Tuning in the HMAX Model, and Implications for Invariant Object Recognition in Cortex , 2004 .

[55]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[56]  S. Thorpe,et al.  The time course of visual processing: Backward masking and natural scene categorisation , 2005, Vision Research.

[57]  A. Treisman,et al.  Perception of objects in natural scenes: is it really attention free? , 2005, Journal of experimental psychology. Human perception and performance.

[58]  Tomaso Poggio,et al.  Fast Readout of Object Identity from Macaque Inferior Temporal Cortex , 2005, Science.

[59]  Thomas Serre,et al.  A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex , 2005 .

[60]  Thomas Serre,et al.  Robust Object Recognition with Cortex-Like Mechanisms , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[61]  R. Rosenfeld Nature , 2009, Otolaryngology--head and neck surgery : official journal of American Academy of Otolaryngology-Head and Neck Surgery.