An information-theoretic optimization principle ('infomax') has previously been used for unsupervised learning of statistical regularities in an input ensemble. The principle states that the input-output mapping implemented by a processing stage should be chosen so as to maximize the average mutual information between input and output patterns, subject to constraints and in the presence of processing noise. In the present work I show how infomax, when applied to a class of nonlinear input-output mappings, can under certain conditions generate optimal filters that have additional useful properties: (1) Output activity (for each input pattern) tends to be concentrated among a relatively small number of nodes. (2) The filters are sensitive to higher-order statistical structure (beyond pairwise correlations). If the input features are localized, the filters' receptive fields tend to be localized as well. (3) Multiresolution sets of filters with subsampling at low spatial frequencies - related to pyramid coding and wavelet representations - emerge as favored solutions for certain types of input ensembles.
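As a rough illustration of the quantity that infomax maximizes, the sketch below computes the mutual information between input and output in the simplest setting where it has a closed form: a linear mapping with additive Gaussian noise. This is not the nonlinear class of mappings treated in this work. The function name `gaussian_mutual_information`, the toy covariance `C`, the unit-norm filters, and the noise variance are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_mutual_information(W, C, noise_var):
    """Mutual information I(X;Y) in nats for Y = W X + N.

    Assumes X ~ N(0, C) and independent noise N ~ N(0, noise_var * I),
    for which I(X;Y) = 0.5 * log det(I + W C W^T / noise_var).
    """
    W = np.asarray(W, dtype=float)
    C = np.asarray(C, dtype=float)
    n_outputs = W.shape[0]
    signal_cov = W @ C @ W.T  # covariance of the noiseless output
    _, logdet = np.linalg.slogdet(np.eye(n_outputs) + signal_cov / noise_var)
    return 0.5 * logdet

# Hypothetical toy ensemble: two uncorrelated inputs with variances 4 and 1.
C = np.array([[4.0, 0.0],
              [0.0, 1.0]])

# Two candidate single-output filters, both constrained to unit norm.
w_high = np.array([[1.0, 0.0]])  # aligned with the high-variance input
w_low = np.array([[0.0, 1.0]])   # aligned with the low-variance input

info_high = gaussian_mutual_information(w_high, C, noise_var=1.0)  # 0.5 * ln(5)
info_low = gaussian_mutual_information(w_low, C, noise_var=1.0)    # 0.5 * ln(2)
print(info_high, info_low)
```

Under these assumptions (unit-norm weights, equal output noise), the filter aligned with the higher-variance input direction conveys more information, so an infomax criterion would prefer it. The nonlinear mappings considered here generally have no such closed form, so this example only shows the shape of the objective.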