Information Geometry and Its Applications: Convex Function and Dually Flat Manifold

Information geometry emerged from the study of invariant properties of a manifold of probability distributions. It includes convex analysis and its duality as a special but important part. Here, we begin with a convex function and construct a dually flat manifold from it. The manifold possesses a Riemannian metric, two types of geodesics, and a divergence function. The generalized Pythagorean theorem and the dual projection theorem are derived from this structure. We then construct alpha-geometry, which extends this convex analysis. The review goes on to present the geometry of a manifold of probability distributions and touches upon a number of applications. The appendix gives an easily understandable introduction to differential geometry and its duality.
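
The construction summarized above can be sketched with the standard formulas of convex duality. The notation here ($\psi$, $\varphi$, $\theta$, $\eta$, $D$) is an assumption chosen for illustration and is not taken from the abstract; the paper's own symbols and conventions may differ.

```latex
% Assumed notation: \psi is a smooth, strictly convex function of the
% coordinates \theta = (\theta^1, \dots, \theta^n).
\begin{align}
  \eta_i &= \frac{\partial \psi(\theta)}{\partial \theta^i}
    && \text{(dual coordinates)} \\
  \varphi(\eta) &= \max_{\theta} \bigl\{ \theta \cdot \eta - \psi(\theta) \bigr\}
    && \text{(Legendre dual of } \psi \text{)} \\
  g_{ij}(\theta) &= \frac{\partial^2 \psi(\theta)}{\partial \theta^i \, \partial \theta^j}
    && \text{(Riemannian metric)} \\
  D[\theta : \theta'] &= \psi(\theta) - \psi(\theta')
      - \nabla \psi(\theta') \cdot (\theta - \theta')
    && \text{(divergence)}
\end{align}
% For three points P, Q, R, expanding the definition of D gives
\begin{equation}
  D[P : Q] + D[Q : R] - D[P : R]
    = (\theta_P - \theta_Q) \cdot (\eta_R - \eta_Q).
\end{equation}
```

Under these assumptions the divergences add, $D[P:R] = D[P:Q] + D[Q:R]$, whenever the right-hand side vanishes, that is, when the $\theta$-coordinate difference between $P$ and $Q$ is orthogonal to the $\eta$-coordinate difference between $Q$ and $R$. This is one way to read the generalized Pythagorean relation mentioned in the abstract.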
