On the Geometry of Feedforward Neural Network Error Surfaces

Many feedforward neural network architectures have the property that their overall input-output function is unchanged by certain weight permutations and sign flips. In this paper, the geometric structure of these equioutput weight space transformations is explored for the case of multilayer perceptron networks with tanh activation functions (similar results hold for many other types of neural networks). It is shown that these transformations form an algebraic group isomorphic to a direct product of Weyl groups. Results concerning the root spaces of the Lie algebras associated with these Weyl groups are then used to derive sets of simple equations for minimal sufficient search sets in weight space. These sets, which take the geometric forms of a wedge and a cone, occupy only a minute fraction of the volume of weight space. A separate analysis shows that the action of the equioutput transformation group creates large numbers of copies of any optimum weight vector of a network performance function, and that these copies all lie on the same sphere. Some implications of these results for learning are discussed.
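
As an illustration (this code is not from the paper), the following NumPy sketch checks the two equioutput symmetries described above for a one-hidden-layer tanh network; all variable names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer tanh perceptron: y = W2 @ tanh(W1 @ x + b1) + b2.
n_in, n_hid, n_out = 3, 4, 2
W1 = rng.standard_normal((n_hid, n_in))
b1 = rng.standard_normal(n_hid)
W2 = rng.standard_normal((n_out, n_hid))
b2 = rng.standard_normal(n_out)

def forward(W1, b1, W2, b2, x):
    return W2 @ np.tanh(W1 @ x + b1) + b2

x = rng.standard_normal(n_in)
y = forward(W1, b1, W2, b2, x)

# Sign flip: tanh is odd, so negating a hidden unit's incoming weights
# and bias together with its outgoing weights cancels out exactly.
s = np.ones(n_hid)
s[0] = -1.0
assert np.allclose(y, forward(s[:, None] * W1, s * b1, W2 * s[None, :], b2, x))

# Permutation: relabeling the hidden units reorders the sum over them
# without changing its value.
p = rng.permutation(n_hid)
assert np.allclose(y, forward(W1[p], b1[p], W2[:, p], b2, x))

# Both maps merely permute and/or negate weight-vector coordinates, so
# they preserve the Euclidean norm: every copy of an optimum weight
# vector lies on the same sphere about the origin.
w = np.concatenate([W1.ravel(), b1, W2.ravel(), b2])
w_flip = np.concatenate([(s[:, None] * W1).ravel(), s * b1,
                         (W2 * s[None, :]).ravel(), b2])
assert np.isclose(np.linalg.norm(w), np.linalg.norm(w_flip))
```

For a layer of n hidden units, these sign flips and permutations generate 2^n · n! equioutput transformations, which is the order of the Weyl group of type B_n referred to in the abstract.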
