Information Geometry and Manifolds of Neural Networks

A manifold of probability distributions naturally possesses a Riemannian metric induced by Fisher information, together with a dual pair of affine connections. This dual geometrical structure plays a fundamental role in various fields of the information sciences, such as statistical inference, systems theory, information theory, and the stochastic theory of neural networks. It is also a key concept connecting statistical inference and statistical physics. Information geometry is introduced in an intuitive manner and then applied to elucidate the family of probability distributions realized by stochastic neural networks called higher-order Boltzmann machines.
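To make the Fisher metric concrete, the following is a minimal sketch, not taken from the paper. It assumes binary units with states in {0, 1} and a log-linear model whose sufficient statistics are products of unit states over subsets of up to `max_order` units, which is one common way to write a higher-order Boltzmann machine. The function names are hypothetical. For such an exponential family, the Fisher information in the natural parameters equals the covariance matrix of the sufficient statistics, and their expectations give the dual (expectation) coordinates.

```python
import itertools
import numpy as np

def sufficient_statistics(x, max_order):
    """Products of unit states over every non-empty subset of size <= max_order."""
    n = len(x)
    return np.array([
        np.prod([x[i] for i in subset])
        for k in range(1, max_order + 1)
        for subset in itertools.combinations(range(n), k)
    ], dtype=float)

def boltzmann_distribution(theta, n_units, max_order):
    """Return p(x) proportional to exp(theta . T(x)) over all 2^n binary states."""
    states = list(itertools.product([0, 1], repeat=n_units))
    T = np.array([sufficient_statistics(s, max_order) for s in states])
    logits = T @ theta
    p = np.exp(logits - logits.max())  # subtract the max for numerical stability
    p /= p.sum()
    return p, T

def fisher_information(theta, n_units, max_order):
    """Fisher metric in natural coordinates: covariance of T(x) under p."""
    p, T = boltzmann_distribution(theta, n_units, max_order)
    eta = p @ T                        # expectation (dual) coordinates
    centered = T - eta
    G = centered.T @ (centered * p[:, None])
    return G, eta

# Assumed toy setting: 3 units with interactions up to order 3 (7 parameters).
G, eta = fisher_information(np.zeros(7), n_units=3, max_order=3)
```

With three units and all interaction orders included, the seven parameters cover every strictly positive distribution on the eight states, so the matrix `G` here is a metric on the whole open simplex; restricting `max_order` to 2 gives the submanifold of ordinary pairwise Boltzmann machines without hidden units.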
