Making a Robot Learn to Play Soccer Using Reward and Punishment

In this paper, we show how reinforcement learning can be applied to real robots to achieve optimal robot behavior. As an example, we enable an autonomous soccer robot to learn to intercept a rolling ball. The main focus is on how to adapt the Q-learning algorithm to the needs of learning strategies for real robots, and on how to transfer strategies learned in simulation to real robots.
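
To make the "reward and punishment" idea concrete, the following is a minimal sketch of standard tabular Q-learning on a toy problem. It is not the adapted algorithm or the interception task from this paper. The one-dimensional track, the reward values, the hyperparameters, and all names (`step`, `train`, `greedy_policy`) are assumptions chosen only for illustration.

```python
import random

# Hypothetical toy task: a robot on a 1-D track of 6 cells, with the ball in the last cell.
N_STATES = 6
ACTIONS = (-1, +1)                       # move one cell left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2    # assumed learning rate, discount, exploration rate


def step(state, action):
    """Toy dynamics: move one cell (clipped to the track).

    Reaching the ball gives a reward of 1.0; every other step costs 0.01.
    """
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    reward = 1.0 if done else -0.01
    return nxt, reward, done


def train(episodes=500, seed=0):
    """Run epsilon-greedy tabular Q-learning and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES - 1)   # random non-terminal start cell
        for _ in range(100):                  # cap the episode length
            if rng.random() < EPSILON:
                a = rng.randrange(2)          # explore
            else:
                a = max(range(2), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Return the greedy action for each non-terminal cell."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])] for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` yields one action per non-terminal cell. In this toy setting the learned policy moves toward the ball from every cell. A real robot has continuous states, sensor noise, and control latency that this sketch ignores.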
