Gradient-based learning algorithms for recurrent connectionist networks

Recurrent connectionist networks are important because they can perform temporally extended tasks, giving them considerable power beyond the static mappings performed by the now-familiar multilayer feedforward networks. This ability to perform highly nonlinear dynamic mappings makes these networks particularly interesting to study and potentially quite useful in tasks which have an important temporal component not easily handled through the use of simple tapped delay lines. Some examples are tasks involving recognition or generation of sequential patterns and sensorimotor control. This report examines a number of learning procedures for adjusting the weights in recurrent networks in order to train such networks to produce desired temporal behaviors from input-output stream examples. The procedures are all based on the computation of the gradient of performance error with respect to network weights, and a number of strategies for computing the necessary gradient information are described. Included here are approaches which are familiar and have been first described elsewhere, along with several novel approaches. One particular purpose of this report is to provide uniform and detailed descriptions and derivations of the various techniques in order to emphasize how they relate to one another. Another important contribution of this report is a detailed analysis of the computational requirements of the various approaches discussed.
