Solving the Ill-Conditioning in Neural Network Learning

In this paper we investigate the feed-forward learning problem. The well-known ill-conditioning which is present in most feed-forward learning problems is shown to be the result of the structure of the network. Also, the well-known problem that weights between ‘higher’ layers in the network have to settle before ‘lower’ weights can converge is addressed. We present a solution to these problems by modifying the structure of the network through the addition of linear connections which carry shared weights. We call the new network structure the linearly augmented feed-forward network, and it is shown that the universal approximation theorems are still valid. Simulation experiments show the validity of the new method, and demonstrate that the new network is less sensitive to local minima and learns faster than the original network.

[1]  K. P. Unnikrishnan,et al.  Alopex: A Correlation-Based Learning Algorithm for Feedforward and Recurrent Neural Networks , 1994, Neural Computation.

[2]  Jürgen Schmidhuber,et al.  Flat Minima , 1997, Neural Computation.

[3]  Ken-ichi Funahashi,et al.  On the approximate realization of continuous mappings by neural networks , 1989, Neural Networks.

[4]  George Cybenko,et al.  Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..

[5]  Yongjun Zhang,et al.  Local-sparse connection multilayer networks , 1995, Proceedings of ICNN'95 - International Conference on Neural Networks.

[6]  Scott E. Fahlman,et al.  An empirical study of learning speed in back-propagation networks , 1988 .

[7]  Roberto Battiti,et al.  First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method , 1992, Neural Computation.

[8]  George Cybenko,et al.  Ill-Conditioning in Neural Network Training Problems , 1993, SIAM J. Sci. Comput..

[9]  Eduardo D. Sontag,et al.  Backpropagation Can Give Rise to Spurious Local Minima Even for Networks without Hidden Layers , 1989, Complex Syst..

[10]  N. Schraudolph On Centering Neural Network Weight Updates ? , 1997 .

[11]  Hava T. Siegelmann,et al.  On the complexity of training neural networks with continuous activation functions , 1995, IEEE Trans. Neural Networks.

[12]  Kurt Hornik,et al.  Multilayer feedforward networks are universal approximators , 1989, Neural Networks.