Solving the Ill-Conditioning in Neural Network Learning

In this paper we investigate the feed-forward learning problem. The well-known ill-conditioning present in most feed-forward learning problems is shown to result from the structure of the network. We also address the well-known problem that weights between 'higher' layers in the network must settle before 'lower' weights can converge. We solve these problems by modifying the structure of the network: linear connections carrying shared weights are added. We call the resulting structure the linearly augmented feed-forward network, and we show that the universal approximation theorems remain valid for it. Simulation experiments confirm the validity of the new method and demonstrate that the new network is less sensitive to local minima and learns faster than the original network.
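To make the architectural idea concrete, below is a minimal NumPy sketch of one layer of such a network. The class name `LinearlyAugmentedLayer`, the choice of a logistic sigmoid, and the reading that all linear bypass connections in a layer share a single scalar weight `gamma` are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LinearlyAugmentedLayer:
    """One layer of a (hypothetical) linearly augmented feed-forward network.

    Each unit computes sigmoid(w.x + b) plus a linear bypass of its net
    input; the bypass is scaled by a single weight `gamma` shared by every
    unit in the layer (an assumption about the paper's scheme).
    """

    def __init__(self, n_in, n_out, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in))
        self.b = np.zeros(n_out)
        self.gamma = 1.0  # shared weight on all linear connections

    def forward(self, x):
        z = self.W @ x + self.b             # net input, shape (n_out,)
        return sigmoid(z) + self.gamma * z  # nonlinear path + shared linear path

# Usage example
layer = LinearlyAugmentedLayer(n_in=3, n_out=2)
y = layer.forward(np.array([0.5, -1.0, 2.0]))
```

Under this reading, the linear path passes gradients to lower layers from the start of training rather than waiting for the sigmoidal units to leave their saturated regions, which matches the motivation the abstract gives: the higher-layer weights no longer have to settle before the lower weights can converge.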
