Active Learning of Parameterized Skills

We introduce a method for actively learning parameterized skills. Parameterized skills are flexible behaviors that can solve any task drawn from a distribution of parameterized reinforcement learning problems. Approaches to learning such skills have been proposed, but limited attention has been given to identifying which training tasks allow for rapid skill acquisition. We construct a non-parametric Bayesian model of skill performance and derive analytical expressions for a novel acquisition criterion capable of identifying tasks that maximize expected improvement in skill performance. We also introduce a spatiotemporal kernel tailored for non-stationary skill performance models. The proposed method is agnostic to policy and skill representation and scales independently of task dimensionality. We evaluate it on a non-linear simulated catapult control problem over arbitrarily mountainous terrains.
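As a rough illustration of the ingredients named above, the following minimal sketch fits a Gaussian process to observed skill performance over (task parameter, time) pairs and then selects a training task. Everything in it is an assumption made for illustration: the kernel is a simple product of squared-exponential terms over task and time, the length scales and noise level are arbitrary, the prior mean is zero, and the selection rule (largest predictive variance) is a plain stand-in rather than the acquisition criterion derived in the paper.

```python
import numpy as np


def spatiotemporal_kernel(a, b, task_scale=0.3, time_scale=5.0):
    """Product of squared-exponential kernels over task parameter and time.

    Rows of `a` and `b` are (task_parameter, time) pairs. This form and its
    length scales are hypothetical, not the kernel defined in the paper.
    """
    d_task = a[:, None, 0] - b[None, :, 0]
    d_time = a[:, None, 1] - b[None, :, 1]
    k_task = np.exp(-0.5 * (d_task / task_scale) ** 2)
    k_time = np.exp(-0.5 * (d_time / time_scale) ** 2)
    return k_task * k_time


def gp_posterior(x_train, y_train, x_query, noise=1e-2):
    """Zero-mean GP posterior mean and variance at the query points."""
    k_tt = spatiotemporal_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = spatiotemporal_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    # Diagonal of k_qt K^{-1} k_qt^T; the prior variance of this kernel is 1.
    reduction = np.einsum("ij,ji->i", k_qt, np.linalg.solve(k_tt, k_qt.T))
    return mean, np.maximum(1.0 - reduction, 0.0)


def next_training_task(x_train, y_train, candidates, now):
    """Pick the candidate task whose performance is most uncertain at time `now`.

    Stand-in rule only: the paper's criterion targets expected improvement in
    skill performance, which is not reproduced here.
    """
    query = np.column_stack([candidates, np.full(len(candidates), now)])
    mean, var = gp_posterior(x_train, y_train, query)
    return candidates[np.argmax(var)], mean, var
```

Here `x_train` would hold the tasks practiced so far together with the time step at which each was practiced, and `y_train` the performance observed on them. Including time as a kernel input is one simple way to let older observations count for less as the skill changes.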
