Information-Theoretic Motor Skill Learning

While there have been recent successes in learning single control policies, several open challenges remain in robot motor skill learning. First, many motor tasks can be solved in multiple ways, so we need to learn each of these solutions as a separate option from which the agent can choose. Furthermore, we need to learn how to adapt an option to the current situation. Finally, we need to combine several options sequentially in order to solve an overall task. As we want to use our method on real robots, high data efficiency is a natural additional requirement for motor skill learning. In this paper, we summarize our work on information-theoretic motor skill learning. We show how to adapt the relative entropy policy search (REPS) algorithm for learning parametrized options and extend the algorithm in a mathematically sound way such that it can meet all of these requirements. Finally, we summarize our experiments conducted on real robots.
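At the heart of REPS is an update in which a search distribution over controller parameters is reweighted to maximize expected return while its Kullback-Leibler divergence to the previous distribution stays below a bound ε; the new distribution is then obtained by a weighted maximum-likelihood fit. The sketch below illustrates one such update for a Gaussian search distribution. It is a minimal illustration under our own assumptions, not the paper's implementation: the function names, the default KL bound, and the use of a Nelder-Mead solver for the one-dimensional dual are ours.

```python
import numpy as np
from scipy.optimize import minimize

def reps_weights(returns, epsilon=0.5):
    """Episodic REPS reweighting: compute sample weights whose induced
    distribution maximizes expected return subject to a KL bound `epsilon`
    against the (uniform) sampling distribution.

    Minimizes the dual g(eta) = eta*epsilon + eta*log(mean(exp(R/eta)))
    over the temperature eta > 0, then sets w_i proportional to exp(R_i/eta).
    """
    R = returns - returns.max()  # shift returns for numerical stability

    def dual(log_eta):
        eta = np.exp(log_eta)  # optimize log(eta) so eta stays positive
        return eta * epsilon + eta * np.log(np.mean(np.exp(R / eta)))

    res = minimize(dual, x0=np.log(1.0), method="Nelder-Mead")
    eta = np.exp(res.x[0])
    w = np.exp(R / eta)
    return w / w.sum()

def reps_update(theta_samples, returns, epsilon=0.5):
    """One policy-update step: weighted ML fit of a Gaussian search
    distribution over controller parameters theta (shape: N x d)."""
    w = reps_weights(returns, epsilon)
    mu = w @ theta_samples
    diff = theta_samples - mu
    cov = (w[:, None] * diff).T @ diff + 1e-6 * np.eye(theta_samples.shape[1])
    return mu, cov
```

Optimizing log η avoids a constrained solver while keeping the temperature positive. The hierarchical and contextual extensions summarized in the paper build on this same weighting, with option responsibilities and context features entering the update; those details are beyond this sketch.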
