Reinforcement Learning to Adjust Robot Movements to New Situations

Many complex robot motor skills can be represented using elementary movements, and efficient techniques exist for learning parametrized motor plans from demonstrations and self-improvement. However, with current techniques the robot often needs to learn a new elementary movement even when a parametrized motor plan already exists that covers a related situation. What is needed instead is a method that modulates the existing elementary movement through the meta-parameters of its representation. In this paper, we describe how to learn such mappings from circumstances to meta-parameters using reinforcement learning. In particular, we use a kernelized version of reward-weighted regression. We show two applications of the presented setup in robotic domains: the generalization of throwing movements in darts, and of hitting movements in table tennis. We demonstrate that both tasks can be learned successfully on simulated and real robots.
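To make the core idea concrete, the following is a minimal sketch of a kernelized reward-weighted regression from situations to meta-parameters. All details here (Gaussian kernel, exponentiated-reward weights, the regularization constant `lam`, and every function name) are illustrative assumptions, not the paper's exact formulation: each sampled rollout contributes a situation, the meta-parameters that were tried, and the reward obtained, and high-reward samples dominate the learned mapping.

```python
import numpy as np

def kernel(a, b, bandwidth=1.0):
    # Gaussian (RBF) kernel between two sets of situation vectors.
    d = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-d / (2.0 * bandwidth ** 2))

def fit_krwr(situations, meta_params, rewards, lam=1e-3):
    """Fit a kernelized reward-weighted regression (illustrative sketch).

    situations  : (N, d) array of observed situations s_i
    meta_params : (N, m) array of meta-parameters gamma_i tried in rollout i
    rewards     : (N,)   array of rewards r_i obtained
    Returns the dual coefficients alpha used for prediction.
    """
    K = kernel(situations, situations)
    # Exponentiated rewards act as (improper) sample weights; shifting by
    # the maximum keeps the exponentials numerically well-behaved.
    w = np.exp(rewards - rewards.max())
    W_inv = np.diag(1.0 / (w + 1e-12))
    # Solve (K + lam * W^-1) alpha = Gamma; low-reward samples get large
    # effective regularization and thus little influence on the fit.
    return np.linalg.solve(K + lam * W_inv, meta_params)

def predict(situations_train, alpha, new_situations):
    # Predicted meta-parameters for previously unseen situations.
    return kernel(new_situations, situations_train) @ alpha
```

In this sketch, generalizing a throwing or hitting movement to a new situation amounts to calling `predict` on the new situation and feeding the resulting meta-parameters (e.g., goal position and movement duration) into the stored motor plan.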
