Supervised Models C1.2 Multilayer perceptrons

This section introduces multilayer perceptrons, the most commonly used type of neural network. It studies the popular backpropagation training algorithm in detail, discusses two techniques for accelerating training (momentum and adaptive step sizes), and briefly references other acceleration techniques. Several implementation issues are then examined. Generalization is studied next, together with several measures for improving it: cross-validation, choice of network size, network pruning, constructive algorithms and regularization. Recurrent networks are then studied in two modes: the fixed-point mode, with the recurrent backpropagation algorithm, and the sequential mode, with the unfolding-in-time algorithm. Time-delay neural networks are also referenced. The section also briefly mentions a large number of applications of multilayer perceptrons, with pointers to the bibliography.
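As a rough illustration of the first two topics, the following minimal sketch trains a one-hidden-layer perceptron by batch backpropagation with a momentum term. It is not taken from the section itself. The network sizes, the tanh hidden units, the linear output, the squared-error cost, the XOR data, and the names `eta` (step size) and `alpha` (momentum coefficient) are all assumptions chosen for brevity.

```python
import math
import random

N_IN, N_HID = 2, 3  # illustrative layer sizes (assumption, not from the text)

def unpack(w):
    """Split the flat weight list into hidden-layer and output-layer parameters."""
    k = N_HID * N_IN
    W1 = [w[i * N_IN:(i + 1) * N_IN] for i in range(N_HID)]
    b1 = w[k:k + N_HID]
    W2 = w[k + N_HID:k + 2 * N_HID]
    b2 = w[k + 2 * N_HID]
    return W1, b1, W2, b2

def forward(w, x):
    """Return hidden activations (tanh) and the linear output for input x."""
    W1, b1, W2, b2 = unpack(w)
    h = [math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(v * hi for v, hi in zip(W2, h)) + b2
    return h, y

def loss(w, data):
    """Total squared error over the training set."""
    return sum(0.5 * (forward(w, x)[1] - t) ** 2 for x, t in data)

def gradient(w, data):
    """Backpropagation: gradient of the total squared error w.r.t. the flat weights."""
    _, _, W2, _ = unpack(w)
    g = [0.0] * len(w)
    k = N_HID * N_IN
    for x, t in data:
        h, y = forward(w, x)
        delta_out = y - t  # error signal at the linear output unit
        for i in range(N_HID):
            # propagate the error back through the tanh unit (tanh' = 1 - h^2)
            delta_h = delta_out * W2[i] * (1.0 - h[i] ** 2)
            for j in range(N_IN):
                g[i * N_IN + j] += delta_h * x[j]
            g[k + i] += delta_h                   # hidden bias
            g[k + N_HID + i] += delta_out * h[i]  # output weight
        g[k + 2 * N_HID] += delta_out             # output bias
    return g

def train(w, data, eta=0.05, alpha=0.9, epochs=2000):
    """Batch gradient descent with momentum: dw <- alpha * dw - eta * grad."""
    w = list(w)
    dw = [0.0] * len(w)
    for _ in range(epochs):
        g = gradient(w, data)
        dw = [alpha * d - eta * gi for d, gi in zip(dw, g)]
        w = [wi + d for wi, d in zip(w, dw)]
    return w

# Toy training set (XOR) and a small random initial weight vector.
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
rng = random.Random(0)
w0 = [rng.uniform(-0.5, 0.5) for _ in range(N_HID * N_IN + 2 * N_HID + 1)]
```

Calling `train(w0, XOR)` returns an updated weight list, and `loss` can be evaluated before and after to observe the effect. Convergence is not guaranteed for every initialization or for every choice of `eta` and `alpha`.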
