Learning to generate subgoals for action sequences

Summary form only given. None of the existing learning algorithms for neural networks in time-varying environments addresses the problems of learning to 'divide and conquer'. It is argued that algorithms based on pure gradient descent or on adaptive critic methods are not suitable for dynamic control problems with long time lags between actions and consequences, and that there is a need for algorithms that perform 'compositional learning'. The author discusses a system that solves at least one problem associated with compositional learning: it learns to generate subgoals. This is done with the help of 'time-bridging' adaptive models that predict the effects of the system's subprograms. An experiment on obstacle avoidance in a two-dimensional environment illustrates the approach.
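
The idea of generating a subgoal with the help of a model that predicts the effect of a subprogram can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the summary: the single circular obstacle, the hand-coded `predicted_cost` function standing in for a learned 'time-bridging' model, the names `total_cost` and `generate_subgoal`, and the use of numerical gradient descent on the subgoal coordinates. It is not the author's system, only one way such a scheme might look in a two-dimensional obstacle-avoidance setting.

```python
import math

# All names and constants below are hypothetical, chosen for illustration only.
OBSTACLE = (0.5, 0.5)  # centre of a single circular obstacle
RADIUS = 0.2           # obstacle radius


def predicted_cost(a, b, samples=50):
    """Stand-in for a 'time-bridging' model.

    Predicts the cost of running the subprogram 'move straight from a to b'
    without stepping through it. A learned model would go here; this version
    is hand-coded as path length plus a smooth penalty near the obstacle.
    """
    length = math.dist(a, b)
    penalty = 0.0
    for i in range(samples + 1):
        t = i / samples
        point = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        d = math.dist(point, OBSTACLE)
        penalty += math.exp(-((d / RADIUS) ** 2))
    return length + 5.0 * length * penalty / (samples + 1)


def total_cost(start, sub, goal):
    """Predicted cost of reaching the goal via one intermediate subgoal."""
    return predicted_cost(start, sub) + predicted_cost(sub, goal)


def generate_subgoal(start, goal, steps=300, lr=0.01, eps=1e-4):
    """Choose a subgoal by numerical gradient descent on the predicted cost."""
    # Start near the midpoint, nudged sideways to break the symmetry.
    sub = ((start[0] + goal[0]) / 2 + 0.05, (start[1] + goal[1]) / 2 - 0.05)
    best, best_cost = sub, total_cost(start, sub, goal)
    for _ in range(steps):
        grad = []
        for k in range(2):
            up = tuple(c + eps if j == k else c for j, c in enumerate(sub))
            down = tuple(c - eps if j == k else c for j, c in enumerate(sub))
            diff = total_cost(start, up, goal) - total_cost(start, down, goal)
            grad.append(diff / (2 * eps))  # central difference
        sub = (sub[0] - lr * grad[0], sub[1] - lr * grad[1])
        cost = total_cost(start, sub, goal)
        if cost < best_cost:  # keep the best subgoal seen so far
            best, best_cost = sub, cost
    return best


if __name__ == "__main__":
    start, goal = (0.0, 0.0), (1.0, 1.0)
    sub = generate_subgoal(start, goal)
    print("subgoal:", sub)
    print("direct cost:", predicted_cost(start, goal))
    print("cost via subgoal:", total_cost(start, sub, goal))
```

In this sketch the straight path from start to goal passes through the obstacle, so the predicted cost of the single subprogram is high; splitting the task at a subgoal placed to one side lets two cheaper subprograms be composed instead.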