Information Geometry on Hierarchical Decomposition of Stochastic Interactions

A joint probability distribution represents stochastic dependency among a number of random variables. Such variables in general have a complex structure of dependencies. A pairwise dependency is easily represented by correlation, but it is more difficult to extract triplewise or higher-order interactions (dependencies) among these variables. These interactions form a hierarchy of dependencies. The present paper uses information geometry to decompose the higher-order dependency into an "orthogonal" sum of pairwise, triplewise and further higher-order dependencies. This naturally gives a new invariant decomposition of joint entropy. This problem is important for extracting intrinsic interactions in the firing patterns of an ensemble of neurons and for estimating their functional connections. The results are generalized to give an orthogonal decomposition in a wide class of hierarchical structures. As an example, we decompose higher-order Markov chains into various lower-order Markov chains. This type of decomposition is related to the mean field approximation in physics.
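As a rough illustration of the "orthogonal" sum described above, the following sketch splits the total dependency of three binary variables into a pairwise part and a triplewise part, each measured by Kullback-Leibler divergence. It is not taken from the paper: the example distribution `p`, the helper names, and the use of iterative proportional fitting to obtain the distribution with the same pairwise marginals but no triplewise interaction are all assumptions made for illustration.

```python
import itertools
import math

# All joint states of three binary variables (x0, x1, x2).
STATES = list(itertools.product((0, 1), repeat=3))
PAIRS = [(0, 1), (0, 2), (1, 2)]


def marginal(p, axes):
    """Marginal distribution of p over the variables listed in axes."""
    out = {}
    for x, px in p.items():
        key = tuple(x[a] for a in axes)
        out[key] = out.get(key, 0.0) + px
    return out


def kl(p, q):
    """Kullback-Leibler divergence D[p : q] in nats."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


def independent_model(p):
    """Product of the single-variable marginals of p."""
    m = [marginal(p, (i,)) for i in range(3)]
    return {x: m[0][(x[0],)] * m[1][(x[1],)] * m[2][(x[2],)] for x in STATES}


def pairwise_model(p, sweeps=500):
    """Iterative proportional fitting, started from the uniform distribution.

    The result matches every pairwise marginal of p but carries no
    triplewise interaction term (an assumption-level stand-in for the
    projection onto the pairwise level of the hierarchy).
    """
    q = {x: 1.0 / len(STATES) for x in STATES}
    targets = {ax: marginal(p, ax) for ax in PAIRS}
    for _ in range(sweeps):
        for ax in PAIRS:
            cur = marginal(q, ax)
            q = {
                x: qx * targets[ax][tuple(x[a] for a in ax)]
                / cur[tuple(x[a] for a in ax)]
                for x, qx in q.items()
            }
    return q


# Hypothetical joint distribution, chosen only for illustration.
values = [0.10, 0.05, 0.05, 0.20, 0.05, 0.20, 0.20, 0.15]
p = dict(zip(STATES, values))

p1 = independent_model(p)   # no interactions at all
p2 = pairwise_model(p)      # pairwise interactions only

d_total = kl(p, p1)    # total dependency
d_triple = kl(p, p2)   # purely triplewise part
d_pair = kl(p2, p1)    # purely pairwise part

print(f"total    = {d_total:.6f}")
print(f"pairwise = {d_pair:.6f}")
print(f"triple   = {d_triple:.6f}")
print(f"sum      = {d_pair + d_triple:.6f}")
```

Up to the numerical accuracy of the fitting loop, the total divergence equals the sum of the pairwise and triplewise parts, which is the Pythagorean-type relation that makes the decomposition additive.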
