Compete to Compute

Local competition among neighboring neurons is common in biological neural networks (NNs). In this paper, we apply this concept to gradient-based, backprop-trained artificial multilayer NNs. NNs with competing linear units tend to outperform those with non-competing nonlinear units, and to avoid catastrophic forgetting when training sets change over time.
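One common form of such competition is a local winner-take-all (LWTA) activation: the layer's linear units are partitioned into small blocks, and within each block only the unit with the largest activation passes its value forward while the others are zeroed. The sketch below is a minimal NumPy illustration of this idea under that assumption (the function name `lwta` and the block size are illustrative, not taken from the paper):

```python
import numpy as np

def lwta(x, block_size=2):
    """Local winner-take-all activation.

    Within each block of `block_size` consecutive linear units,
    only the unit with the largest activation keeps its value;
    the rest are set to zero. Note the winner is the maximum
    activation, not the maximum magnitude, so a negative winner
    passes its negative value through.
    """
    n = x.shape[-1]
    assert n % block_size == 0, "layer width must be divisible by block size"
    blocks = x.reshape(-1, n // block_size, block_size)
    winners = blocks.argmax(axis=-1, keepdims=True)
    # Build a 0/1 mask that is 1 only at each block's winning unit.
    mask = np.zeros_like(blocks)
    np.put_along_axis(mask, winners, 1.0, axis=-1)
    return (blocks * mask).reshape(x.shape)

# Example: a layer of 4 linear units grouped into 2 competing blocks.
x = np.array([[0.3, -1.2, 0.5, 2.0]])
y = lwta(x)  # each block keeps only its largest activation
```

Because the winning unit's output is its unmodified linear activation, the per-unit mapping stays piecewise linear, and gradients flow through the winners unchanged during backprop.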
