Learning concurrent motor skills in versatile solution spaces

Future robots need to acquire motor skills autonomously to reduce their reliance on human programming. Many motor skill learning methods concentrate on learning a single solution for a given task. However, discarding information about additional solutions during learning unnecessarily limits autonomy. Favoring a single solution often requires re-learning the motor skill whenever the task, the environment, or the robot's body changes in a way that renders the learned solution infeasible. Future robots need to adapt to such changes and, ideally, maintain a large repertoire of movements to cope with them. In contrast to current methods, our approach simultaneously learns multiple distinct solutions for the same task, such that a partial degeneration of this solution space does not prevent successful completion of the task. In this paper, we present a complete framework that is capable of learning different solution strategies for a real-robot Tetherball task.
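The idea of maintaining several distinct solutions at once can be illustrated with a toy episodic policy search. The sketch below is not the paper's actual algorithm; it is a minimal, hypothetical example in which a task has two equally good parameter settings (reward peaks at θ = +2 and θ = −2), and two Gaussian "option" policies are updated by responsibility-gated, reward-weighted averaging so that each settles on a different solution. All names and the reward function are illustrative assumptions.

```python
import math
import random

random.seed(0)

def reward(theta):
    # Hypothetical task with two distinct solutions, at theta = +2 and theta = -2.
    return max(math.exp(-(theta - 2.0) ** 2), math.exp(-(theta + 2.0) ** 2))

# Two Gaussian "option" policies, deliberately initialized on opposite sides.
means, stds = [0.5, -0.5], [1.0, 1.0]

for _ in range(200):
    # Sample parameters from both options.
    samples = [random.gauss(means[k], stds[k]) for k in range(2) for _ in range(20)]
    for k in range(2):
        # Responsibility gating: option k only learns from samples
        # that lie closer to its own mean than to the other option's.
        own = [(th, reward(th)) for th in samples
               if abs(th - means[k]) <= abs(th - means[1 - k])]
        z = sum(r for _, r in own)
        if z > 1e-9:
            # Reward-weighted maximum-likelihood update of mean and std.
            new_mean = sum(th * r for th, r in own) / z
            var = sum(r * (th - new_mean) ** 2 for th, r in own) / z
            means[k] = new_mean
            stds[k] = max(math.sqrt(var), 0.1)  # floor keeps exploration alive

print(sorted(means))  # the two options converge to distinct solutions
```

Without the responsibility gating, both components would collapse onto whichever reward peak the early samples happen to favor; gating is what preserves a versatile set of distinct solutions in this toy setting.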
