Information Geometry of Neural Networks

Japan has launched a large Brain Science Program, which includes the theoretical foundations of neurocomputing. The mathematical foundation of brain-style computation is one of the main targets of our laboratory at the RIKEN Brain Science Institute. The present talk will introduce the Japanese Brain Science Program and then give a direction toward a mathematical foundation of neurocomputing. A neural network is specified by a number of real free parameters (connection weights or synaptic efficacies) which are modifiable by learning. The set of all such networks forms a multi-dimensional manifold. In order to understand the total capability of such networks, it is useful to study the intrinsic geometrical structure of this neuromanifold. When a network is disturbed by noise, its behavior is given by a conditional probability distribution. In such a case, information geometry gives a fundamental geometrical structure. We apply information geometry to the set of multi-layer perceptrons. Because it is a Riemannian space, we are naturally led to the Riemannian or natural gradient learning method, which proves to give a strikingly fast and accurate learning algorithm. The geometry also shows that various types of singularities exist in the manifold; these are not peculiar to neural networks but are common to all hierarchical systems. The singularities severely affect learning behavior. All of these aspects are analyzed mathematically.
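
As a rough illustration of the natural gradient idea mentioned above, the following minimal sketch applies it to the simplest possible case: a single stochastic neuron with p(y = 1 | x, w) = sigmoid(w · x), not a multi-layer perceptron. The update is w ← w − η G(w)⁻¹ ∇L(w), where G(w) is the Fisher information matrix that plays the role of the Riemannian metric. All function names, the damping term, and the step size are assumptions made for this example and are not taken from the talk; the Fisher matrix is estimated by averaging over the sample inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll_gradient(w, X, y):
    """Gradient of the average negative log-likelihood of the stochastic neuron."""
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

def fisher_matrix(w, X, damping=1e-6):
    """Fisher information G(w) = E_x[p(1 - p) x x^T], averaged over the rows of X.

    The small damping term (an assumption of this sketch) keeps G invertible.
    """
    p = sigmoid(X @ w)
    weights = p * (1.0 - p)
    G = (X * weights[:, None]).T @ X / len(X)
    return G + damping * np.eye(X.shape[1])

def natural_gradient_step(w, X, y, lr=0.5):
    """One update w <- w - lr * G(w)^{-1} grad L(w)."""
    g = nll_gradient(w, X, y)
    G = fisher_matrix(w, X)
    # Solve G d = g instead of forming the inverse explicitly.
    return w - lr * np.linalg.solve(G, g)

def fit(X, y, steps=50, lr=0.5):
    """Run a fixed number of natural gradient steps starting from w = 0."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = natural_gradient_step(w, X, y, lr)
    return w
```

For this one-neuron model the natural gradient step coincides with Fisher scoring. In a multi-layer perceptron the Fisher matrix is much larger and harder to invert, which this sketch does not address.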
