Learning and Invariance in a Family of Hierarchical Kernels

Understanding the invariance and discrimination properties of hierarchical models is arguably the key to understanding how and why such models, of which the mammalian visual system is one instance, can lead to good generalization and reduce the sample complexity of a given learning task. In this paper we explore invariance to transformations and the role of layer-wise embeddings within an abstract framework of hierarchical kernels motivated by the visual cortex. In this framework, a novel form of invariance is induced by propagating the effect of locally defined, invariant kernels throughout the hierarchy. We study this notion of invariance empirically. We then present an extension of the abstract hierarchical modeling framework that incorporates layer-wise embeddings, which we demonstrate can lead to improved generalization and scalable algorithms. Finally, we experimentally analyze sample complexity as a function of architectural parameters.
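The recursion behind "propagating locally defined, invariant kernels throughout the hierarchy" can be made concrete in a short sketch. The following is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a normalized-inner-product base kernel on raw patches, a fixed set of template patches per layer, and max pooling over local translations as the locally defined invariance. All names (`base_kernel`, `neural_response`, `derived_kernel`) and parameter choices are hypothetical.

```python
import numpy as np

def base_kernel(x, y):
    """Layer-0 kernel: normalized inner product of raw patches."""
    return float(x.ravel() @ y.ravel()) / (
        np.linalg.norm(x) * np.linalg.norm(y) + 1e-12)

def patches(img, size):
    """All size-by-size sub-patches of a square array; these local
    translations play the role of the transformations pooled over."""
    n = img.shape[0]
    return [img[i:i + size, j:j + size]
            for i in range(n - size + 1)
            for j in range(n - size + 1)]

def neural_response(img, templates, size, kernel):
    """Layer response: for each template, max-pool the previous-layer
    kernel over all local sub-patches. The max is what induces local
    invariance, which higher layers then inherit."""
    subs = patches(img, size)
    return np.array([max(kernel(h, t) for h in subs) for t in templates])

def derived_kernel(x, y, templates, size, kernel):
    """Next-layer kernel: normalized inner product of layer responses."""
    nx = neural_response(x, templates, size, kernel)
    ny = neural_response(y, templates, size, kernel)
    return float(nx @ ny) / (
        np.linalg.norm(nx) * np.linalg.norm(ny) + 1e-12)

# Usage: one derived layer over a base kernel on 4x4 patches.
rng = np.random.default_rng(0)
templates = [rng.standard_normal((4, 4)) for _ in range(16)]
x, y = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
k1 = lambda a, b: derived_kernel(a, b, templates, 4, base_kernel)
print(k1(x, y))
```

Stacking layers amounts to passing the derived kernel back in as the `kernel` argument with larger patches and templates; the max over local translations taken at each layer is thereby propagated to every layer above it. A layer-wise embedding in the sense of the abstract could replace the explicit template responses with a finite-dimensional feature map of the layer's kernel.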
