Reinforcement learning vs human programming in tetherball robot games

Reinforcement learning of motor skills is a key challenge for endowing robots with the ability to learn a wide range of skills and solve complex tasks. However, comparing reinforcement learning against human programming is not straightforward. In this paper, we build a motor learning framework from state-of-the-art components of motor skill learning and compare it to a manually designed program on the task of robot tetherball. We use dynamical motor primitives to represent the robot's trajectories and relative entropy policy search to train the framework and improve its behavior by trial and error. These algorithmic components allow for high-quality skill learning, while the experimental setup enables an accurate evaluation of our framework, as robot players can compete against each other. In the complex game of robot tetherball, we show that our learning approach outperforms and wins a match against a high-quality hand-crafted system.
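To make the learning component concrete, the following is a minimal sketch of episodic relative entropy policy search (REPS): a Gaussian search distribution over policy parameters is updated from sampled returns, where the sample weights exp(R/eta) come from solving the REPS dual for the temperature eta. This is an illustrative toy, not the paper's implementation: the quadratic reward stands in for returns from actual motor-primitive rollouts, and names such as `reps_weights` and `theta_star` are invented for this example.

```python
import numpy as np
from scipy.optimize import minimize

def reps_weights(returns, epsilon=0.5):
    """Solve the episodic REPS dual for the temperature eta and
    return normalized sample weights proportional to exp(R / eta)."""
    R = returns - np.max(returns)  # shift returns for numerical stability
    def dual(log_eta):
        eta = np.exp(log_eta[0])   # optimize in log-space to keep eta > 0
        return eta * epsilon + eta * np.log(np.mean(np.exp(R / eta)))
    res = minimize(dual, x0=[0.0], method="Nelder-Mead")
    eta = np.exp(res.x[0])
    w = np.exp(R / eta)
    return w / np.sum(w)

rng = np.random.default_rng(0)
theta_star = np.array([0.5, -0.3])   # hypothetical optimal parameters
mean, cov = np.zeros(2), np.eye(2)   # Gaussian search distribution

for _ in range(20):
    samples = rng.multivariate_normal(mean, cov, size=50)
    # Toy episodic reward: negative squared distance to the optimum,
    # standing in for the return of a motor-primitive rollout.
    returns = -np.sum((samples - theta_star) ** 2, axis=1)
    w = reps_weights(returns)
    mean = w @ samples                         # weighted maximum-likelihood mean
    diff = samples - mean
    cov = (w[:, None] * diff).T @ diff + 1e-6 * np.eye(2)  # weighted covariance

print(np.round(mean, 2))
```

The KL bound `epsilon` controls how far each update may move the search distribution, which is what distinguishes REPS from plain reward-weighted averaging.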
