Dense Associative Memory for Pattern Recognition

A model of associative memory is studied that stores and reliably retrieves many more patterns than the number of neurons in the network. We propose a simple duality between this dense associative memory and the neural networks commonly used in deep learning. On the associative memory side of this duality, a family of models can be constructed that smoothly interpolates between two limiting cases: one limit is referred to as the feature-matching mode of pattern recognition, the other as the prototype regime. On the deep learning side of the duality, this family corresponds to feedforward neural networks with one hidden layer and various activation functions that transmit the activities of the visible neurons to the hidden layer. This family of activation functions includes logistic functions, rectified linear units, and rectified polynomials of higher degree. The proposed duality makes it possible to apply energy-based intuition from associative memory to analyze computational properties of neural networks with unusual activation functions, namely the higher rectified polynomials, which until now have not been used in deep learning. The utility of the dense memories is illustrated for two test cases: the logical gate XOR and the recognition of handwritten digits from the MNIST data set.
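To make the family concrete, the kind of energy function described above can be sketched as follows. This is a minimal reconstruction from the abstract, not a formula quoted from it: $\xi^{\mu}_{i}$ denotes the $i$-th component of the $\mu$-th stored pattern, $\sigma_{i}$ the state of the $i$-th binary neuron, and $F$ a smooth interaction function; all three symbols are assumptions introduced here for illustration.

E = -\sum_{\mu=1}^{K} F\!\left(\sum_{i=1}^{N} \xi^{\mu}_{i}\,\sigma_{i}\right),
\qquad
F_{n}(x) = \begin{cases} x^{n}, & x \ge 0,\\ 0, & x < 0. \end{cases}

Under this sketch, a quadratic $F$ reproduces the familiar pairwise (Hopfield-type) energy, consistent with the feature-matching limit, while larger powers $n$ sharpen the energy landscape around individual stored patterns, consistent with the prototype regime and with the rectified polynomial activations of the dual feedforward network.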
