Local Synaptic Learning Rules Suffice to Maximize Mutual Information in a Linear Network

A network that develops to maximize the mutual information between its output and the signal portion of its input (which is admixed with noise) is useful for extracting salient input features, and may provide a model for aspects of biological neural network function. I describe a local synaptic learning rule that performs stochastic gradient ascent in this information-theoretic quantity, for the case in which the input-output mapping is linear and the input signal and noise are multivariate Gaussian. Feedforward connection strengths are modified by a Hebbian rule during a "learning" phase, in which examples of input signal plus noise are presented to the network, and by an anti-Hebbian rule during an "unlearning" phase, in which examples of noise alone are presented. Each recurrent lateral connection has two values of connection strength, one for each phase; these values are updated by an anti-Hebbian rule.
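The two-phase structure can be illustrated with a small numerical sketch. It is not the local rule itself. It uses the exact batch gradient of the same objective, computed from known covariance matrices, and replaces the lateral connections with a direct linear solve. The additive output noise of variance `beta`, the names `cov_signal`, `cov_noise`, and `C`, and all sizes and step sizes are assumptions introduced for illustration. Under those assumptions, for output z = C(s + n) + output noise, the mutual information between output and signal is half the difference of the log-determinants of the output covariance in the "learning" phase (signal plus noise) and in the "unlearning" phase (noise alone).

```python
import numpy as np

def output_covariances(C, cov_signal, cov_noise, beta):
    """Output covariance in the learning phase (signal + noise input)
    and in the unlearning phase (noise-only input).
    beta is an assumed output-noise variance, not taken from the abstract."""
    m = C.shape[0]
    q_learn = C @ (cov_signal + cov_noise) @ C.T + beta * np.eye(m)
    q_unlearn = C @ cov_noise @ C.T + beta * np.eye(m)
    return q_learn, q_unlearn

def mutual_information(C, cov_signal, cov_noise, beta):
    """I(output; signal) in nats for the linear Gaussian model."""
    q_learn, q_unlearn = output_covariances(C, cov_signal, cov_noise, beta)
    return 0.5 * (np.linalg.slogdet(q_learn)[1] - np.linalg.slogdet(q_unlearn)[1])

def information_gradient(C, cov_signal, cov_noise, beta):
    """Exact gradient of the mutual information with respect to C.
    The first term is the expected Hebbian (learning-phase) contribution,
    the second the expected anti-Hebbian (unlearning-phase) contribution."""
    q_learn, q_unlearn = output_covariances(C, cov_signal, cov_noise, beta)
    hebbian = np.linalg.solve(q_learn, C @ (cov_signal + cov_noise))
    anti_hebbian = np.linalg.solve(q_unlearn, C @ cov_noise)
    return hebbian - anti_hebbian

# Hypothetical problem: 6 inputs, 2 outputs, random signal covariance.
rng = np.random.default_rng(0)
n_in, n_out = 6, 2
A = rng.standard_normal((n_in, n_in))
cov_signal = A @ A.T / n_in
cov_noise = 0.1 * np.eye(n_in)
beta = 0.1

C = 0.1 * rng.standard_normal((n_out, n_in))
info_before = mutual_information(C, cov_signal, cov_noise, beta)

# Plain gradient ascent on the feedforward weights.
for _ in range(1000):
    C += 0.005 * information_gradient(C, cov_signal, cov_noise, beta)

info_after = mutual_information(C, cov_signal, cov_noise, beta)
print(f"information before: {info_before:.3f} nats, after: {info_after:.3f} nats")
```

In a stochastic version, each expectation would be replaced by a single presented example, and the matrix solves would have to be carried out by the network itself; the abstract assigns that role to the recurrent lateral connections, one set of strengths per phase.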
