Information processing by a perceptron in an unsupervised learning task

We study the ability of a simple neural network (a perceptron architecture with no hidden units and binary outputs) to process information in the context of an unsupervised learning task. The network is asked to provide the best possible neural representation of a given input distribution, according to a criterion taken from information theory. We compare several optimization criteria that have been proposed: maximum information transmission, minimum redundancy, and closeness to a factorial code. We show that for the perceptron one can compute the maximum information that the code (the output neural representation) can convey about the input, and that statistical mechanics tools, such as the replica method, can be used to compute the typical mutual information between the input and output distributions. More precisely, for a Gaussian input source with a given correlation matrix, we compute the typical mutual information when the couplings are chosen randomly. We determine the correlations b...
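The abstract does not spell out the computation, but the quantity it refers to has a simple structure in the noiseless case: for a deterministic perceptron the conditional entropy of the output given the input vanishes, so the mutual information I(V; σ) reduces to the output entropy H(σ). The Python sketch below estimates this by Monte Carlo sampling for a Gaussian source with a given correlation matrix and randomly chosen couplings. The correlation matrix, dimensions, and coupling statistics are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch (not the authors' replica calculation): estimate
# I(V; sigma) = H(sigma) for a deterministic binary perceptron by sampling.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

N_in, N_out = 8, 4     # input/output dimensions (hypothetical)
n_samples = 200_000    # Monte Carlo sample size

# Gaussian input source with an assumed exponential correlation matrix C.
idx = np.arange(N_in)
C = 0.5 ** np.abs(idx[:, None] - idx[None, :])
L = np.linalg.cholesky(C)
V = rng.standard_normal((n_samples, N_in)) @ L.T   # rows ~ N(0, C)

# Random couplings, one row per output unit (the "couplings chosen
# randomly" setting mentioned in the abstract).
J = rng.standard_normal((N_out, N_in)) / np.sqrt(N_in)

# Binary outputs: sigma_i = sign(J_i . V).
sigma = np.sign(V @ J.T).astype(int)

# Plug-in estimate of the output entropy H(sigma) in bits. Since the
# mapping V -> sigma is deterministic, H(sigma | V) = 0 and this
# entropy equals the mutual information I(V; sigma).
counts = Counter(map(tuple, sigma))
p = np.array(list(counts.values())) / n_samples
H_bits = -np.sum(p * np.log2(p))

print(f"estimated I(V; sigma) = H(sigma) ~ {H_bits:.3f} bits "
      f"(upper bound: {N_out} bits)")
```

Stronger input correlations (or correlated rows of J) reduce the estimated information below the N_out-bit ceiling, which is the qualitative effect the paper quantifies analytically via the replica method.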
