Unsupervised learning by competing hidden units

Significance

Despite the great success of deep learning, the question remains to what extent the computational properties of deep neural networks are similar to those of the human brain. A particularly nonbiological aspect of deep learning is the supervised training process with the backpropagation algorithm, which requires massive amounts of labeled data and a nonlocal learning rule for changing the synapse strengths. This paper describes a learning algorithm that does not suffer from these two problems. It learns the weights of the lower layer of a neural network in a completely unsupervised fashion, and the entire algorithm uses local learning rules that have conceptual biological plausibility.

Abstract

It is widely believed that end-to-end training with the backpropagation algorithm is essential for learning good feature detectors in the early layers of artificial neural networks, so that these detectors are useful for the task performed by the higher layers of that network. At the same time, the traditional form of backpropagation is biologically implausible. In the present paper we propose an unusual learning rule, which has a degree of biological plausibility and which is motivated by Hebb's idea that the change of a synapse's strength should be local, i.e., should depend only on the activities of the pre- and postsynaptic neurons. We design a learning algorithm that uses global inhibition in the hidden layer and is capable of learning early feature detectors in a completely unsupervised way. These learned lower-layer feature detectors can then be used to train higher-layer weights in the usual supervised way, so that the performance of the full network is comparable to the performance of standard feedforward networks trained end-to-end with backpropagation on simple tasks.
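To make the two-stage idea concrete, below is a minimal NumPy sketch of a local, competition-based update for the first-layer weights. It is an illustration under stated assumptions, not the paper's exact rule: the winner-take-all ranking stands in for global inhibition, and the function name, learning rate, choice of which runner-up unit receives an anti-Hebbian push, and the Oja-style weight-norm stabilization are all illustrative choices.

```python
import numpy as np

def unsupervised_hebbian_layer(X, n_hidden=100, epochs=10, lr=0.02,
                               anti_rank=7, delta=0.4, seed=0):
    """Learn first-layer weights with a local, competition-based rule (sketch).

    X         : (n_samples, n_inputs) array of unlabeled training inputs.
    n_hidden  : number of hidden units competing through global inhibition.
    anti_rank : rank of the unit that receives an anti-Hebbian push
                (an illustrative assumption, not prescribed here).
    delta     : strength of that anti-Hebbian push.
    """
    rng = np.random.default_rng(seed)
    n_inputs = X.shape[1]
    W = rng.normal(size=(n_hidden, n_inputs))

    for _ in range(epochs):
        for v in X[rng.permutation(len(X))]:
            I = W @ v                          # input currents to hidden units
            order = np.argsort(I)[::-1]        # ranking induced by competition
            g = np.zeros(n_hidden)
            g[order[0]] = 1.0                  # winning unit learns (Hebbian)
            g[order[anti_rank - 1]] = -delta   # a runner-up unlearns (anti-Hebbian)
            # Local update: depends only on the presynaptic activity v, the
            # postsynaptic current I, and the unit's own weights; the second
            # term keeps the weight norms bounded (Oja-style stabilization).
            W += lr * g[:, None] * (v[None, :] - I[:, None] * W)
    return W
```

In this scheme, the returned weight matrix would serve as the fixed lower layer, on top of which a conventional classifier layer is trained with labels in the usual supervised way.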
