Information geometry of neural learning and belief propagation

Summary form only given. Many complex systems, such as the brain, operate remarkably well under stochastic uncertainty and noisy fluctuations, and learning and statistical inference play a key role in such systems. Information geometry emerged from studies of the intrinsic structure of the manifold of probability distributions and applies to a wide variety of information systems, providing a new tool for elucidating their intrinsic structure. The article gives an accessible introduction to information geometry and then shows how its geometric concepts apply to learning algorithms in neural networks. The set of neural networks of a given architecture, for example multilayer perceptrons, forms a manifold in which the modifiable parameters serve as coordinates. This manifold, called the neuromanifold, is Riemannian, and learning takes place within it: the dynamics of learning is represented by a trajectory in this Riemannian space. Using this example, we explain why the geometrical viewpoint is important, and we propose an efficient learning method, the natural gradient method, which outperforms error backpropagation. Belief propagation is an important method developed in artificial intelligence for stochastic inference on graphical models; it is a hot topic connecting artificial intelligence, learning, error-correcting codes, and stochastic reasoning. Information geometry elucidates the intrinsic structure of belief propagation and unifies the approaches of artificial intelligence, information coding, statistics, and statistical physics. These topics are also explained.
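
Since the abstract only sketches the idea, the following is a minimal illustration, not taken from the article, of what a natural gradient update looks like in code: the ordinary gradient is preconditioned by the inverse Fisher information matrix, which plays the role of the Riemannian metric on the neuromanifold. All names here (natural_gradient_step, loss_grad, fisher_information) and the toy Gaussian-mean example are hypothetical assumptions for illustration only.

    import numpy as np

    def natural_gradient_step(theta, loss_grad, fisher_information, lr=0.1, damping=1e-4):
        # One natural gradient update: precondition the ordinary (Euclidean)
        # gradient by the inverse Fisher information matrix G(theta),
        # i.e. solve G(theta) d = grad and step along d.
        g = loss_grad(theta)
        G = fisher_information(theta)
        # Damping keeps the metric invertible when G is ill-conditioned.
        nat_grad = np.linalg.solve(G + damping * np.eye(len(theta)), g)
        return theta - lr * nat_grad

    # Toy usage (assumed setting): estimate the mean of a Gaussian with known
    # variance sigma2, where the Fisher information is 1 / sigma2.
    sigma2 = 4.0
    data = np.random.default_rng(0).normal(loc=2.0, scale=np.sqrt(sigma2), size=500)
    loss_grad = lambda th: np.array([(th[0] - data.mean()) / sigma2])  # grad of average NLL
    fisher = lambda th: np.array([[1.0 / sigma2]])
    theta = np.array([0.0])
    for _ in range(20):
        theta = natural_gradient_step(theta, loss_grad, fisher, lr=1.0)
    print(theta)  # converges toward the sample mean

In this toy setting the Fisher preconditioning cancels the 1/sigma2 scaling of the gradient, so a unit learning rate jumps essentially straight to the optimum; this parameterization-invariant step size is the intuition behind the claimed advantage over plain backpropagation.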
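
Likewise, a minimal sum-product belief propagation sketch is given below for a chain-structured pairwise model, where message passing is exact because the graph is a tree; the names node_potentials and edge_potentials are illustrative assumptions, not notation from the article.

    import numpy as np

    def chain_belief_propagation(node_potentials, edge_potentials):
        # node_potentials: list of length-K arrays phi_i(x_i)
        # edge_potentials: list of K x K arrays psi_i(x_i, x_{i+1})
        # Returns the node marginals obtained by sum-product message passing.
        n = len(node_potentials)
        K = len(node_potentials[0])
        fwd = [np.ones(K) for _ in range(n)]   # messages passed left -> right
        bwd = [np.ones(K) for _ in range(n)]   # messages passed right -> left
        for i in range(1, n):
            m = edge_potentials[i - 1].T @ (node_potentials[i - 1] * fwd[i - 1])
            fwd[i] = m / m.sum()
        for i in range(n - 2, -1, -1):
            m = edge_potentials[i] @ (node_potentials[i + 1] * bwd[i + 1])
            bwd[i] = m / m.sum()
        marginals = []
        for i in range(n):
            b = node_potentials[i] * fwd[i] * bwd[i]
            marginals.append(b / b.sum())
        return marginals

On graphs with cycles the same message-passing rules give only approximate marginals, and it is this loopy case whose intrinsic structure information geometry helps to clarify.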