Information geometry of Boltzmann machines

A Boltzmann machine is a network of stochastic neurons. The set of all Boltzmann machines with a fixed topology forms a high-dimensional geometric manifold, in which the modifiable synaptic weights of the connections play the role of a coordinate system specifying individual networks. A learning trajectory, for example, is a curve in this manifold. To understand the capabilities and limitations of neural networks of a fixed topology, it is important to study the geometry of this neural manifold rather than the behavior of a single network. Using the new theory of information geometry, a natural invariant Riemannian metric and a dual pair of affine connections are established on the manifold of Boltzmann machines. The meaning of these geometrical structures is elucidated from the stochastic and statistical points of view, and this leads to a natural modification of the Boltzmann machine learning rule.
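As a rough illustration of this geometric picture, the sketch below builds a tiny fully visible Boltzmann machine, computes the Fisher information matrix of its Gibbs distribution (the natural invariant Riemannian metric on the weight manifold), and uses it to precondition the classical learning rule, in the spirit of the metric-based modification described above. The unit count, 0/1 state encoding, random target distribution, learning rate, and the small regularizer added before inversion are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch (not the paper's own code): a fully visible Boltzmann
# machine over n binary units, the Fisher information matrix of its Gibbs
# distribution, and a metric-corrected (natural-gradient) variant of the
# classical Boltzmann learning rule.
import itertools
import numpy as np

n = 3                                    # number of stochastic neurons (toy choice)
states = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

def suff_stats(s):
    """Sufficient statistics t(s): pairwise products s_i s_j and the units s_i."""
    return np.concatenate([[s[i] * s[j] for i, j in pairs], s])

T = np.array([suff_stats(s) for s in states])   # |states| x dim(theta)

def model_probs(theta):
    """Exact Gibbs distribution p(s; theta) proportional to exp(theta . t(s))."""
    logits = T @ theta
    logits -= logits.max()                       # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def fisher_metric(theta):
    """Fisher information G(theta) = Cov_theta[t(s)], the Riemannian metric."""
    p = model_probs(theta)
    mean_t = p @ T
    centred = T - mean_t
    return centred.T @ (centred * p[:, None])

# Toy "environment": an arbitrary target distribution over the 2^n states.
rng = np.random.default_rng(0)
q = rng.random(len(states))
q /= q.sum()
target_mean_t = q @ T

theta = np.zeros(T.shape[1])
eta = 0.5                                        # illustrative learning rate
for step in range(200):
    grad = target_mean_t - model_probs(theta) @ T   # classical Boltzmann rule
    G = fisher_metric(theta)
    nat_grad = np.linalg.solve(G + 1e-6 * np.eye(len(grad)), grad)
    theta += eta * nat_grad                         # metric-corrected update

print("final KL(q || p):",
      float(np.sum(q * (np.log(q) - np.log(model_probs(theta))))))
```

The classical Boltzmann rule would apply `grad` (the gap between environmental and model correlations) directly; premultiplying by the inverse Fisher metric makes the update independent of how the weights are parametrized, which is one natural way to read the modification suggested by the geometric structure.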
