Recurrent Sampling Models for the Helmholtz Machine

Many recent analysis-by-synthesis density estimation models of cortical learning and processing have made the crucial simplifying assumption that units within a single layer are mutually independent given the states of units in the layer below or the layer above. In this article, we suggest using either a Markov random field or an alternative stochastic sampling architecture to capture explicitly particular forms of dependence within each layer. We develop the architectures in the context of real and binary Helmholtz machines. Recurrent sampling can be used to capture correlations within layers in the generative or the recognition models, and we also show how these can be combined.
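To make the idea concrete, below is a minimal sketch, not the paper's implementation, of one wake-sleep step for a binary two-layer Helmholtz machine in which the bottom-up recognition pass is followed by a few Gibbs-sampling sweeps over lateral (within-layer) recognition weights, so hidden units are no longer conditionally independent given the input. Layer sizes, the learning rate, the number of sweeps, and the uniform top-level prior are all illustrative assumptions.

```python
# Minimal sketch of wake-sleep with recurrent (Gibbs) sampling in the
# recognition model. Sizes, learning rate, and lateral pass are assumptions,
# not the paper's exact formulation.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_x, n_h = 8, 4                                 # visible / hidden sizes (assumed)
R = 0.01 * rng.standard_normal((n_h, n_x))      # recognition weights x -> h
G = 0.01 * rng.standard_normal((n_x, n_h))      # generative weights h -> x
L = np.zeros((n_h, n_h))                        # lateral (within-layer) recognition weights
lr = 0.05

def sample(p):
    return (rng.random(p.shape) < p).astype(float)

def wake_phase(x, n_gibbs=5):
    """Recognize x, with Gibbs sweeps coupling the hidden units laterally."""
    global G
    h = sample(sigmoid(R @ x))                  # bottom-up proposal
    for _ in range(n_gibbs):                    # recurrent sampling within the layer
        for j in range(n_h):
            net = R[j] @ x + L[j] @ h - L[j, j] * h[j]
            h[j] = float(rng.random() < sigmoid(net))
    # delta-rule update of generative weights toward reconstructing x
    G += lr * np.outer(x - sigmoid(G @ h), h)
    return h

def sleep_phase():
    """Fantasize from the generative model, then train the recognition weights."""
    global R, L
    h = sample(sigmoid(np.zeros(n_h)))          # top-level prior (assumed uniform)
    x = sample(sigmoid(G @ h))
    p_h = sigmoid(R @ x + L @ h)
    R += lr * np.outer(h - p_h, x)
    L += lr * np.outer(h - p_h, h)
    np.fill_diagonal(L, 0.0)                    # no self-connections

# one training step on a toy binary pattern
x = sample(np.full(n_x, 0.5))
wake_phase(x)
sleep_phase()
```

The lateral weights L are what distinguish this from the factorial recognition model: with L fixed at zero the sketch reduces to the standard wake-sleep algorithm, in which hidden units are sampled independently given the input.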
