Generative models for discovering sparse distributed representations.

We describe a hierarchical generative model that can be viewed as a nonlinear generalization of factor analysis and can be implemented in a neural network. The model uses bottom-up, top-down, and lateral connections to perform Bayesian perceptual inference correctly. Once perceptual inference has been performed, the connection strengths can be updated using a very simple learning rule that requires only locally available information. We demonstrate that the network learns to extract sparse, distributed, hierarchical representations.
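To make the two phases concrete, the following is a minimal sketch of top-down generation and a local weight update in a layered model of this general kind. Several details are assumptions not stated in the abstract: the nonlinearity is taken to be rectification of hidden units, the noise is Gaussian with a single shared standard deviation, and the learning rule is written as a delta rule. The function names `generate` and `delta_rule_update` are hypothetical. The inference step itself is omitted, so the hidden states passed to the update are assumed to have been inferred already.

```python
import random


def relu(x):
    """Rectification, the nonlinearity assumed in this sketch."""
    return x if x > 0.0 else 0.0


def generate(weights, biases, noise_std, rng):
    """Top-down ancestral pass: sample each layer given the layer above.

    weights[k][i][j] connects unit i of layer k to unit j of layer k + 1.
    biases[k][j] is the bias of unit j in layer k (layer 0 is the top).
    Hidden layers are rectified; the bottom (visible) layer is left linear.
    """
    last = len(biases) - 1
    top = [b + rng.gauss(0.0, noise_std) for b in biases[0]]
    states = [top if last == 0 else [relu(v) for v in top]]
    for k, W in enumerate(weights, start=1):
        parent = states[-1]
        layer = []
        for j, b in enumerate(biases[k]):
            mean = b + sum(parent[i] * W[i][j] for i in range(len(parent)))
            value = mean + rng.gauss(0.0, noise_std)
            layer.append(value if k == last else relu(value))
        states.append(layer)
    return states


def delta_rule_update(W, parent, child, child_bias, lr):
    """Local update: W[i][j] += lr * parent[i] * (child[j] - prediction[j]).

    Only the activity of the two connected units and the top-down
    prediction for the child are needed, so the rule is local.
    """
    for j in range(len(child)):
        prediction = child_bias[j] + sum(
            parent[i] * W[i][j] for i in range(len(parent))
        )
        error = child[j] - prediction
        for i in range(len(parent)):
            W[i][j] += lr * parent[i] * error
    return W


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[[0.5, -0.5, 1.0], [1.0, 0.0, -1.0]]]  # 2 hidden -> 3 visible
    biases = [[0.0, 0.0], [0.0, 0.0, 0.0]]
    hidden, visible = generate(weights, biases, noise_std=1.0, rng=rng)
    delta_rule_update(weights[0], hidden, visible, biases[1], lr=0.05)
    print(hidden, visible)
```

With rectified hidden units, roughly half of them are exactly zero in any one sample when the top-down input is near zero, which is one simple way a model of this shape can yield sparse codes.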
