Towards Compositional Learning in Dynamic Networks

None of the existing learning algorithms for neural networks with internal and/or external feedback addresses the problem of learning by composing subprograms, of learning 'to divide and conquer'. In this work it is argued that algorithms based on pure gradient descent or on temporal difference methods are not suitable for large-scale dynamic control problems, and that there is a need for algorithms that perform 'compositional learning'. Some problems associated with compositional learning are identified, and a system is described which attacks at least one of them. The system learns to generate sub-goals that help to achieve its main goals. This is done with the help of 'time-bridging' adaptive models that predict the effects of the system's sub-programs. An experiment is reported which demonstrates the feasibility of the method.
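To make the mechanism concrete, below is a minimal sketch, not the paper's implementation, of adapting a sub-goal generator through a 'time-bridging' adaptive model. The 1-D environment, the fixed low-level subprogram, the linear models, and all names (run_subprogram, the coefficients a, b, c, u, v, w, the learning rates) are illustrative assumptions: a predictive model learns to bridge the time consumed by the subprogram by predicting its end state, and the sub-goal generator is then adjusted by gradient descent through that model so that the predicted end state matches the main goal.

```python
# A minimal, self-contained sketch of sub-goal learning with a "time-bridging"
# predictive model.  Everything here (the 1-D environment, the linear models,
# the learning rates) is an illustrative assumption, not the paper's system.
import random

def run_subprogram(state, subgoal, steps=5):
    """Fixed low-level subprogram: moves the state most of the way toward
    the given sub-goal over several time steps, with a little noise."""
    for _ in range(steps):
        state += 0.3 * (subgoal - state) + random.gauss(0.0, 0.01)
    return state

# Time-bridging model: predicts the subprogram's end state from its start
# state and sub-goal, end ~= a*state + b*subgoal + c.
a, b, c = 0.0, 0.0, 0.0
# Sub-goal generator: proposes subgoal = u*state + v*goal + w.
u, v, w = 0.0, 1.0, 0.0
LR_MODEL, LR_GEN = 0.05, 0.05

for episode in range(3000):
    state = random.uniform(-1.0, 1.0)
    goal = random.uniform(-1.0, 1.0)

    subgoal = u * state + v * goal + w      # generator proposes a sub-goal
    end = run_subprogram(state, subgoal)    # execute and observe the outcome

    pred = a * state + b * subgoal + c      # model's prediction of the end state
    model_err = pred - end                  # prediction error
    goal_err = pred - goal                  # predicted miss of the main goal
    dsub = goal_err * b                     # d(loss)/d(subgoal) through the model

    # 1) Train the time-bridging model by gradient descent on prediction error.
    a -= LR_MODEL * model_err * state
    b -= LR_MODEL * model_err * subgoal
    c -= LR_MODEL * model_err

    # 2) Adjust the generator through the (frozen) model so that the predicted
    #    end state of the subprogram moves toward the main goal.
    u -= LR_GEN * dsub * state
    v -= LR_GEN * dsub * goal
    w -= LR_GEN * dsub

# After training, the generator compensates for the subprogram falling short:
# it proposes sub-goals whose execution actually lands on the main goal.
state, goal = 0.8, -0.5
subgoal = u * state + v * goal + w
print("sub-goal:", round(subgoal, 3),
      "reached:", round(run_subprogram(state, subgoal), 3),
      "main goal:", goal)
```

In the setting of the paper the subprograms, predictive models, and generators are dynamic networks rather than linear maps; the sketch only illustrates the gradient pathway from the main goal, through the time-bridging predictive model, into the sub-goal generator.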
