Latent space policy search for robotics

Learning motor skills for robots is a challenging task. In particular, a high number of degrees of freedom poses a serious challenge to existing reinforcement learning methods, since it leads to a high-dimensional search space. However, complex robots are often intrinsically redundant systems and can therefore be controlled through a latent manifold of much lower dimensionality. In this paper, we present a novel policy search method that performs efficient reinforcement learning by uncovering the low-dimensional latent space of actuator redundancies. In contrast to previous attempts at combining reinforcement learning and dimensionality reduction, our approach does not perform dimensionality reduction as a preprocessing step but naturally integrates it into policy search. Our evaluations show that the new approach outperforms existing algorithms for learning motor skills with high-dimensional robots.
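The paper's algorithm itself is not reproduced here, but the core idea of searching over a low-dimensional latent parameterization of a redundant, high-dimensional actuator space (rather than over the actuators directly) can be sketched with a generic episodic policy-search loop. The linear decoder W, the dimensions, the toy reward, and the reward-weighted update below are illustrative assumptions, not the method proposed in the paper.

```python
import numpy as np

# Illustrative sketch only (not the paper's algorithm): episodic policy search
# over a low-dimensional latent vector z, decoded into high-dimensional
# actuator parameters a = W @ z + b. W, b, the dimensions, and the
# reward-weighted update are assumptions for illustration.

D_ACTION = 30   # high-dimensional actuator/parameter space (assumed)
D_LATENT = 3    # low-dimensional latent manifold (assumed)

rng = np.random.default_rng(0)
W = rng.normal(size=(D_ACTION, D_LATENT))   # fixed linear decoder (assumed known here)
b = np.zeros(D_ACTION)

def reward(a):
    """Stand-in reward: prefer actuator commands close to a fixed target."""
    target = np.ones(D_ACTION)
    return -np.sum((a - target) ** 2)

# Search distribution is maintained over the latent space, not the actuator space.
mu = np.zeros(D_LATENT)
sigma = np.ones(D_LATENT)

for it in range(50):
    # Sample latent parameters, decode to actuator space, evaluate the rollouts.
    Z = rng.normal(mu, sigma, size=(100, D_LATENT))
    R = np.array([reward(W @ z + b) for z in Z])

    # Reward-weighted maximum-likelihood update (a common episodic
    # policy-search heuristic; the temperature beta is an assumption).
    beta = 10.0 / (R.max() - R.min() + 1e-9)
    w = np.exp(beta * (R - R.max()))
    w /= w.sum()

    mu = w @ Z
    sigma = np.sqrt(w @ (Z - mu) ** 2 + 1e-6)

print("best latent mean:", mu)
```

In a setting like the one the abstract describes, the decoder would not be fixed by hand but uncovered jointly with the policy-search updates; that coupling is precisely what the paper contrasts with dimensionality reduction as a preprocessing step.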
