Forward Models: Supervised Learning with a Distal Teacher

Internal models of the environment play an important role in adaptive systems in general and are of particular importance for the supervised learning paradigm. In this article we demonstrate that certain classical problems associated with the notion of the “teacher” in supervised learning can be solved by judicious use of learned internal models as components of the adaptive system. In particular, we show how supervised learning algorithms can be used when an unknown dynamical system intervenes between actions and desired outcomes. Our approach applies to any supervised learning algorithm that is capable of learning in multilayer networks.
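As an illustration of the setting described above, the following is a minimal sketch, not the article's own implementation. It assumes a toy linear environment (the hypothetical matrix `A_ENV`), a linear forward model `W_f`, and a linear controller `W_c`; all names, dimensions, and learning rates are illustrative assumptions. The learner first fits a forward model from observed action-outcome pairs, then trains the controller by passing the outcome-space ("distal") error backward through the fixed forward model to obtain an error signal in action space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "unknown" environment: maps a 3-D action to a 2-D outcome.
# The learner never reads A_ENV directly; it only observes outcomes.
A_ENV = np.array([[1.0, 0.5, -0.3],
                  [0.2, -1.0, 0.8]])

def environment(action):
    """Return the outcome produced by an action (assumed toy dynamics)."""
    return A_ENV @ action

def train_forward_model(steps=2000, lr=0.05):
    """Phase 1: learn to predict outcomes from actions (LMS rule)."""
    W_f = np.zeros((2, 3))
    for _ in range(steps):
        u = rng.normal(size=3)            # exploratory action
        y = environment(u)                # observed outcome
        pred_err = y - W_f @ u            # prediction error
        W_f += lr * np.outer(pred_err, u)
    return W_f

def train_controller(W_f, steps=3000, lr=0.05):
    """Phase 2: train the controller with the forward model held fixed."""
    W_c = np.zeros((3, 2))
    for _ in range(steps):
        y_star = rng.normal(size=2)       # desired outcome (distal target)
        u = W_c @ y_star                  # action proposed by the controller
        y = environment(u)                # actual outcome
        distal_err = y_star - y           # error measured in outcome space
        # Propagate the distal error back through the forward model to get
        # an error signal in action space, where no teacher is available.
        action_err = W_f.T @ distal_err
        W_c += lr * np.outer(action_err, y_star)
    return W_c

W_f = train_forward_model()
W_c = train_controller(W_f)

# Evaluate on fresh targets: the outcomes should match the desired outcomes.
targets = rng.normal(size=(100, 2))
outcomes = np.array([environment(W_c @ t) for t in targets])
mean_sq_distal_error = float(np.mean(np.sum((targets - outcomes) ** 2, axis=1)))
print(mean_sq_distal_error)
```

In this sketch the environment has more action dimensions than outcome dimensions, so many controllers achieve zero distal error; the procedure settles on one of them. With nonlinear networks the transposed-matrix step would be replaced by ordinary backpropagation through the forward model.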
