Vision Based State Space Construction for Learning Mobile Robots in Multi-agent Environments

State space construction is one of the most fundamental issues in applying reinforcement learning methods to real robot tasks, because these methods need a well-defined state space in order to converge correctly. The problem becomes especially difficult in multi-agent environments, where the visual information observed by a learning robot can appear uncorrelated with its own motion, owing to the actions of other agents whose policies are unknown. This paper proposes a method that estimates the relationship between the learner's behaviors and those of the other agents in the environment through interactions (observation and action), using system identification to construct a state space in such an environment. To determine the state vector for each agent, Akaike's Information Criterion is applied to the result of the system identification. Reinforcement learning based on the estimated state vectors is then used to obtain the optimal behavior. The proposed method is applied to soccer-playing physical agents that learn to cope with a rolling ball and another moving agent. Computer simulations and real experiments are presented and discussed.
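To illustrate the order-selection step described above, the following is a minimal sketch (not the paper's canonical-variate-analysis procedure) of how Akaike's Information Criterion can choose the dimension of an estimated dynamic model: scalar autoregressive models of increasing order are fit by least squares, and the order minimizing the AIC is selected. All function names and the synthetic data are illustrative assumptions, not from the paper.

```python
import numpy as np

def fit_ar(y, order):
    """Fit a scalar AR(order) model by least squares; return coefficients,
    residual sum of squares, and the number of fitted samples."""
    n = len(y) - order
    # Regression matrix of lagged observations: row t holds y[t], ..., y[t+order-1]
    X = np.column_stack([y[i:i + n] for i in range(order)])
    target = y[order:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = np.sum((target - X @ coef) ** 2)
    return coef, rss, n

def aic(rss, n, k):
    # AIC for Gaussian residuals: n*log(RSS/n) + 2*(number of parameters)
    return n * np.log(rss / n) + 2 * k

# Synthetic data from a known second-order process (stands in for the
# observation sequence gathered through the robot's interactions)
rng = np.random.default_rng(0)
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 1.2 * y[t - 1] - 0.5 * y[t - 2] + 0.1 * rng.standard_normal()

scores = {order: aic(*fit_ar(y, order)[1:], k=order) for order in range(1, 6)}
best = min(scores, key=scores.get)
```

In the paper's setting the same criterion is applied to multivariable models relating the learner's actions to each observed agent, so that each agent receives a state vector of the smallest adequate dimension.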
