A Learning Algorithm for Boltzmann Machines

The computational power of massively parallel networks of simple processing elements resides in the communication bandwidth provided by the hardware connections between elements. These connections can allow a significant fraction of the knowledge of the system to be applied to an instance of a problem in a very short time. One kind of computation for which massively parallel networks appear to be well suited is large constraint satisfaction searches, but to use the connections efficiently two conditions must be met: First, a search technique that is suitable for parallel networks must be found. Second, there must be some way of choosing internal representations which allow the preexisting hardware connections to be used efficiently for encoding the constraints in the domain being searched. We describe a general parallel search method, based on statistical mechanics, and we show how it leads to a general learning rule for modifying the connection strengths so as to incorporate knowledge about a task domain in an efficient way. We describe some simple examples in which the learning algorithm creates internal representations that are demonstrably the most efficient way of using the preexisting connectivity structure.
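The abstract does not spell out the search method or the learning rule, so the following is only a minimal sketch under stated assumptions. It assumes binary units with symmetric weights, an energy of the form E = -Σ_{i<j} w_ij s_i s_j + Σ_i θ_i s_i, a logistic rule for switching a unit on at temperature T, and a weight change proportional to the difference between pairwise co-activation probabilities measured with the visible units clamped and with the network running freely. All function and parameter names (`anneal`, `update_weights`, `schedule`, `rate`, and so on) are hypothetical and chosen for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def global_energy(weights, thresholds, state):
    """Assumed energy: -sum_{i<j} w_ij s_i s_j + sum_i theta_i s_i."""
    n = len(state)
    pair = sum(weights[i][j] * state[i] * state[j]
               for i in range(n) for j in range(i + 1, n))
    return -pair + sum(thresholds[i] * state[i] for i in range(n))


def energy_gap(weights, thresholds, state, k):
    """Energy with unit k off minus energy with unit k on."""
    total = sum(weights[k][i] * state[i]
                for i in range(len(state)) if i != k)
    return total - thresholds[k]


def anneal(weights, thresholds, state, schedule, clamped=(), rng=random):
    """Stochastic search: update unclamped units while lowering temperature.

    `schedule` is a list of (temperature, sweeps) pairs (an assumed format).
    """
    state = list(state)
    free = [k for k in range(len(state)) if k not in clamped]
    for temperature, sweeps in schedule:
        for _ in range(sweeps * len(free)):
            k = rng.choice(free)
            gap = energy_gap(weights, thresholds, state, k)
            # Unit k turns on with probability 1 / (1 + exp(-gap / T)).
            state[k] = 1 if rng.random() < sigmoid(gap / temperature) else 0
    return state


def update_weights(weights, p_clamped, p_free, rate):
    """Assumed learning rule: dw_ij = rate * (p_clamped_ij - p_free_ij).

    p_clamped[i][j] and p_free[i][j] are estimated probabilities that units
    i and j are both on, with and without the visible units clamped.
    The update is applied in place and keeps the weights symmetric.
    """
    n = len(weights)
    for i in range(n):
        for j in range(i + 1, n):
            delta = rate * (p_clamped[i][j] - p_free[i][j])
            weights[i][j] += delta
            weights[j][i] += delta
    return weights
```

In this sketch, high temperatures let the network escape poor local minima of the energy and low temperatures settle it into good ones. Each weight update uses only statistics local to the two units that the connection joins.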
