Learning in Boltzmann Trees

We introduce a large family of Boltzmann machines that can be trained by standard gradient descent. The networks can have one or more layers of hidden units, with tree-like connectivity. We show how to implement the supervised learning algorithm for these Boltzmann machines exactly, without resort to simulated or mean-field annealing. The stochastic averages that yield the gradients in weight space are computed by the technique of decimation. We present results on the problems of N-bit parity and the detection of hidden symmetries.
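The decimation step can be illustrated for the simplest case the paper relies on: ±1 units with pairwise couplings and no bias terms, where a hidden unit of degree two with couplings J1 and J2 can be summed out exactly, leaving an effective direct coupling J' with tanh(J') = tanh(J1)·tanh(J2). The Python sketch below is an illustrative assumption, not the paper's code; the function names and the chain example are hypothetical, and it verifies the decimation rule against brute-force enumeration.

```python
# Minimal sketch of decimation for +/-1 spins with pairwise couplings and no biases.
# A degree-2 intermediate unit with couplings J1, J2 can be summed out, leaving an
# effective coupling J' satisfying tanh(J') = tanh(J1) * tanh(J2).
import itertools
import math


def decimate(J1, J2):
    """Effective coupling after summing out the intermediate spin."""
    return math.atanh(math.tanh(J1) * math.tanh(J2))


def correlation_chain(couplings):
    """Exact <s_0 s_N> for a chain of couplings, via repeated decimation."""
    J_eff = couplings[0]
    for J in couplings[1:]:
        J_eff = decimate(J_eff, J)
    return math.tanh(J_eff)  # two-spin correlation under the effective coupling


def correlation_brute_force(couplings):
    """Same correlation by explicit summation over all spin configurations."""
    n = len(couplings) + 1
    num = den = 0.0
    for spins in itertools.product([-1, 1], repeat=n):
        w = math.exp(sum(J * spins[i] * spins[i + 1]
                         for i, J in enumerate(couplings)))
        num += spins[0] * spins[-1] * w
        den += w
    return num / den


if __name__ == "__main__":
    Js = [0.8, -0.3, 1.2]               # arbitrary example couplings
    print(correlation_chain(Js))        # decimation result
    print(correlation_brute_force(Js))  # agrees to machine precision
```

Repeating this elimination inward from the leaves of a tree is what makes the stochastic averages, and hence the exact learning gradients, tractable without annealing.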
