Steepest descent algorithms for neural network controllers and filters

A number of steepest descent algorithms have been developed for adapting discrete-time dynamical systems, including the backpropagation through time and recursive backpropagation algorithms. This paper presents a tutorial on using these algorithms to adapt neural network controllers and filters. To compare and contrast the algorithms effectively, a unified framework is developed, based on a standard representation of a discrete-time dynamical system. Within this framework, the computational and storage requirements of each algorithm are derived, and these requirements are used to select the appropriate algorithm for training a neural network controller or filter. Finally, a neural network control example and a neural network filtering example illustrate the techniques.
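
The trade-off between computation and storage mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the scalar system x[k+1] = tanh(w·x[k] + u[k]), the quadratic cost, and all function names are assumptions chosen for illustration. The gradient of the cost with respect to w is computed two ways, once by a backward sweep over a stored trajectory (in the style of backpropagation through time) and once by a forward sensitivity recursion that stores no trajectory (in the spirit of the recursive family; whether it matches the paper's recursive backpropagation exactly is also an assumption).

```python
import math

def simulate(w, x0, u):
    """Run the assumed system x[k+1] = tanh(w*x[k] + u[k]); return all states."""
    xs = [x0]
    for uk in u:
        xs.append(math.tanh(w * xs[-1] + uk))
    return xs

def cost(w, x0, u, d):
    """Quadratic cost J = 0.5 * sum_k (x[k] - d[k])^2 over k = 1..K."""
    xs = simulate(w, x0, u)
    return 0.5 * sum((x - dk) ** 2 for x, dk in zip(xs[1:], d))

def grad_bptt(w, x0, u, d):
    """Backward sweep: stores the whole trajectory (storage grows with K)."""
    xs = simulate(w, x0, u)
    grad, carry = 0.0, 0.0           # carry = dJ/dx[k+1] * dx[k+1]/dx[k]
    for k in range(len(u), 0, -1):
        delta = (xs[k] - d[k - 1]) + carry   # total derivative dJ/dx[k]
        local = 1.0 - xs[k] ** 2             # derivative of tanh at step k
        grad += delta * local * xs[k - 1]    # direct dependence of x[k] on w
        carry = delta * local * w            # propagate back to x[k-1]
    return grad

def grad_forward(w, x0, u, d):
    """Forward sensitivity recursion: keeps only s = dx[k]/dw, no trajectory."""
    x, s, grad = x0, 0.0, 0.0
    for uk, dk in zip(u, d):
        x_next = math.tanh(w * x + uk)
        s = (1.0 - x_next ** 2) * (x + w * s)  # ds[k+1] from the chain rule
        x = x_next
        grad += (x - dk) * s
    return grad

if __name__ == "__main__":
    u = [0.5, -0.3, 0.8, 0.1]        # illustrative inputs
    d = [0.2, 0.0, 0.4, -0.1]        # illustrative targets
    print(grad_bptt(0.7, 0.1, u, d), grad_forward(0.7, 0.1, u, d))
```

In this scalar case both routines do similar work per step. With n states and p parameters, the forward recursion would carry an n-by-p sensitivity matrix at every step, while the backward sweep would carry an n-vector but needs the stored trajectory, which is the kind of requirement the abstract says is used to choose between algorithms.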
