METAL: A framework for mixture-of-experts task and attention learning

The rapid increase in the size and complexity of sensory systems demands attention control in real-world robotic tasks. However, attention control and the task itself are often highly interlaced, which calls for interactive learning. In this paper, a framework called METAL (Mixture-of-Experts Task and Attention Learning) is proposed to cope with this complex learning problem. METAL consists of three consecutive learning phases: the first two phases provide initial knowledge about the task, while in the third phase attention control is learned concurrently with the task. The mind of the robot is composed of a set of tiny agents learning and acting in parallel, in addition to an attention control learning (ACL) agent. Each tiny agent provides the ACL agent with partial knowledge about the task in the form of its decision preference, i.e., its policy. In the third phase, the ACL agent learns how to make the final decision by attending to the fewest possible tiny agents. It acts on a continuous decision space, which gives METAL the ability to integrate different sources of knowledge with ease. A Bayesian continuous RL method is utilized at both levels of learning, on the perceptual and decision spaces. Implementation of METAL on an E-puck robot in a miniature highway driving task, along with further simulation studies in the Webots™ environment, verifies the applicability and effectiveness of the proposed framework, where a smooth driving behavior is shaped. It is also shown that even though the robot has learned to discard some sensory data, the probability of aliasing arising in the decision space is very low, which means that the robot can learn the task as well as attention control simultaneously.
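The core decision mechanism described above can be illustrated with a minimal sketch. All names and numbers here are hypothetical: each tiny agent maps its own partial observation to a continuous action preference (its policy output), and the ACL agent attends only to agents whose learned attention weight passes a threshold, fusing the attended preferences into one continuous decision. The learned weights themselves would come from the paper's Bayesian continuous RL, which is not reproduced here.

```python
import numpy as np

class TinyAgent:
    """One expert: maps its private sensory feature to a continuous
    action preference (a stand-in for the agent's learned policy)."""
    def __init__(self, weight):
        self.weight = weight  # hypothetical learned parameter

    def preference(self, feature):
        # Bounded continuous preference derived from this agent's partial view.
        return np.tanh(self.weight * feature)

def acl_decide(agents, features, attention, threshold=0.5):
    """Attend only to agents whose attention weight exceeds the threshold,
    then fuse their preferences into a single continuous decision."""
    attended = [i for i, a in enumerate(attention) if a > threshold]
    if not attended:
        # Fall back to the single most salient agent if none pass.
        attended = [int(np.argmax(attention))]
    prefs = np.array([agents[i].preference(features[i]) for i in attended])
    w = np.array([attention[i] for i in attended])
    # Attention-weighted average of the attended policies.
    return float(np.dot(w, prefs) / w.sum())

agents = [TinyAgent(w) for w in (0.8, -0.3, 1.2)]
features = [0.5, 0.9, -0.2]
attention = [0.9, 0.1, 0.7]   # weights learned by the ACL agent; agent 1 is discarded
decision = acl_decide(agents, features, attention)
```

Because the fused output lives on a continuous scale (here in [-1, 1] via `tanh`), preferences from heterogeneous knowledge sources can be combined by simple weighted averaging, which is the property the abstract attributes to acting on a continuous decision space.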
