Simultaneous learning of spatial visual attention and physical actions

This paper introduces a new method for learning top-down, task-driven visual attention control jointly with physical actions in interactive environments. Our method builds on the Reinforcement Learning of Visual Classes (RLVC) algorithm, adapting it to learn spatial visual selection in order to reduce computational complexity. The proposed algorithm also addresses perceptual aliasing that arises when previous actions and perceptions are unknown. Continued learning shows that our method is robust to perturbations in perceptual information, and the same framework supports object recognition when class labels are used in place of physical actions. We aim for maximum generalization while performing only local processing. Experiments on visual navigation and object recognition tasks show that our method is more efficient in terms of computational complexity and is biologically more plausible.
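To illustrate the core idea of learning *where to look* and *what to do* with a single reward signal, here is a minimal, hypothetical sketch in tabular Q-learning. It is not the paper's RLVC-based algorithm; the toy task, the two-stage Q-tables (`q_attend`, `q_act`), and all parameter values are our own assumptions. Each episode hides a binary cue in one image patch; the agent first chooses which patch to attend, observes it, then picks a motor action, and is rewarded only when the action matches the cue.

```python
import random

random.seed(0)

# Hypothetical toy task (not from the paper): a binary cue sits in
# patch 0; patch 1 contains independent noise. The agent chooses an
# attention target, observes the attended patch, then acts.
N_PATCHES, N_ACTIONS = 2, 2
ALPHA, EPS, EPISODES = 0.1, 0.2, 5000

# Stage-1 Q: value of attending each patch (single pre-observation state).
q_attend = [0.0] * N_PATCHES
# Stage-2 Q: value of each motor action given (attended patch, observed bit).
q_act = {(p, b): [0.0] * N_ACTIONS for p in range(N_PATCHES) for b in (0, 1)}

def eps_greedy(values, eps=EPS):
    """Pick a random index with prob. eps, else the greedy one."""
    if random.random() < eps:
        return random.randrange(len(values))
    return max(range(len(values)), key=lambda i: values[i])

for _ in range(EPISODES):
    cue = random.randint(0, 1)
    patches = [cue, random.randint(0, 1)]  # patch 0 informative, patch 1 noise
    p = eps_greedy(q_attend)               # attention decision
    obs = patches[p]                       # percept at the attended location
    a = eps_greedy(q_act[(p, obs)])        # motor decision
    r = 1.0 if a == cue else 0.0
    # One-step Q-learning backups; the episode ends after the motor action.
    q_act[(p, obs)][a] += ALPHA * (r - q_act[(p, obs)][a])
    q_attend[p] += ALPHA * (max(q_act[(p, obs)]) - q_attend[p])
```

After training, the learned attention values should favor the informative patch (`q_attend[0] > q_attend[1]`), showing how a single scalar reward can shape both gaze selection and action selection without any explicit supervision of where to look.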
