Learning to Generate Artificial Fovea Trajectories for Target Detection

This paper shows how ‘static’ neural approaches to adaptive target detection can be replaced by a more efficient, sequential alternative. The alternative is inspired by the observation that biological systems employ sequential eye movements for pattern recognition. We describe a system that builds an adaptive model of the time-varying inputs of an artificial fovea, which is controlled by an adaptive neural controller. The controller uses this model to learn to generate fovea trajectories that move the fovea to a target in a visual scene. The system also learns to track moving targets. No teacher provides the desired activations of ‘eye muscles’ at various times; the only goal information is the shape of the target. Because this is a ‘reward-only-at-goal’ task, it involves a complex temporal credit assignment problem. Some implications for adaptive attentive systems in general are discussed.
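
The ‘reward-only-at-goal’ setting can be made concrete with a small sketch. It is an illustration under stated assumptions, not the system described in the paper: the 6x6 binary scene, the 3x3 fovea, the integer `(dx, dy)` commands standing in for ‘eye muscle’ activations, and the names `fovea_input`, `move`, and `final_error` are all hypothetical. The adaptive model and the neural controller are not shown; the sketch only illustrates that the fovea input changes along a trajectory and that the error is measured once, at the end, against the target shape.

```python
# Illustrative sketch only: scene size, fovea size, and all function names
# are assumptions made for this example, not details taken from the paper.

TARGET = [[0, 1, 0],
          [1, 1, 1],
          [0, 1, 0]]  # the only goal information: the shape of the target


def make_scene(width=6, height=6, target_x=3, target_y=2):
    """Build an empty binary scene and paste TARGET with its top-left corner at (target_x, target_y)."""
    scene = [[0] * width for _ in range(height)]
    for dy, row in enumerate(TARGET):
        for dx, value in enumerate(row):
            scene[target_y + dy][target_x + dx] = value
    return scene


def fovea_input(scene, x, y, size=3):
    """Return the size x size patch seen when the fovea's top-left corner is at (x, y)."""
    return [row[x:x + size] for row in scene[y:y + size]]


def move(pos, action, scene, size=3):
    """Apply one (dx, dy) command, clipping so the fovea stays inside the scene."""
    max_x = len(scene[0]) - size
    max_y = len(scene) - size
    x = min(max(pos[0] + action[0], 0), max_x)
    y = min(max(pos[1] + action[1], 0), max_y)
    return (x, y)


def final_error(scene, start, actions, target=TARGET, size=3):
    """Follow a trajectory and return the squared error between the final fovea input and the target.

    No error is computed for intermediate steps, which is what makes
    credit assignment over the individual commands difficult.
    """
    pos = start
    for action in actions:
        pos = move(pos, action, scene, size)
    patch = fovea_input(scene, pos[0], pos[1], size)
    return sum((p - t) ** 2
               for patch_row, target_row in zip(patch, target)
               for p, t in zip(patch_row, target_row))


scene = make_scene()
good = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1)]  # ends with the fovea on the target
bad = [(1, 0)]                                   # ends on empty background
print(final_error(scene, (0, 0), good))
print(final_error(scene, (0, 0), bad))
```

In this toy setting a learner receives one number per trajectory and has to work out which of the individual commands were responsible for it.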
