Learning Transformational Invariants from Time-Varying Natural Images

How does the brain represent and learn the structure contained in dynamic visual scenes? Previous work on unsupervised learning in the time domain has shown that sparse coding can uncover direction-selective components tuned to specific spatial and temporal frequency bands. However, these models do not capture more complex forms of motion selectivity, such as speed tuning or pattern-motion selectivity, response characteristics found in the higher visual area MT.
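To make the sparse-coding approach referenced above concrete, the following is a minimal sketch (not the paper's actual model) of dictionary learning with an L1 sparsity penalty, of the kind that yields oriented, band-tuned components when trained on whitened spatiotemporal patches. Here the "natural image sequence" data is replaced by random placeholder patches, and all dimensions, step sizes, and iteration counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for whitened spatiotemporal patches: each column is a
# flattened 8x8x4 (x, y, t) block. Real experiments would use patches
# cut from natural movie sequences.
patch_dim, n_patches, n_basis = 8 * 8 * 4, 500, 64
X = rng.standard_normal((patch_dim, n_patches))

# Random initial dictionary with unit-norm columns.
Phi = rng.standard_normal((patch_dim, n_basis))
Phi /= np.linalg.norm(Phi, axis=0)

def infer(Phi, X, lam=0.1, n_steps=50):
    """ISTA: sparse coefficients minimizing ||X - Phi A||^2 + lam*||A||_1."""
    L = np.linalg.norm(Phi, 2) ** 2            # Lipschitz constant of the gradient
    A = np.zeros((Phi.shape[1], X.shape[1]))
    for _ in range(n_steps):
        A = A - (Phi.T @ (Phi @ A - X)) / L    # gradient step on the quadratic term
        A = np.sign(A) * np.maximum(np.abs(A) - lam / L, 0.0)  # soft-threshold
    return A

def recon_error(Phi, X, A):
    return np.mean((X - Phi @ A) ** 2)

# Alternate sparse inference with gradient updates of the dictionary.
err0 = recon_error(Phi, X, infer(Phi, X))
for _ in range(20):
    A = infer(Phi, X)
    Phi += 0.1 * (X - Phi @ A) @ A.T / n_patches  # descend reconstruction error
    Phi /= np.linalg.norm(Phi, axis=0)            # keep columns unit norm
err1 = recon_error(Phi, X, infer(Phi, X))
```

On actual movie patches, the learned columns of `Phi` would be the direction-selective, frequency-band-tuned components described above; the point of the sketch is only the alternating inference/learning loop.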