Paper information - Active online learning of the bipedal walking

Active online learning of the bipedal walking

In legged-robot walking pattern learning, most mainstream and state-of-the-art research follows a simulation-based framework, in which the walking pattern is learned on a pre-built simulation platform. However, when the learned walking pattern is applied to a real robot, an additional adaptation procedure is always required because of the large gap between simulated and real walking conditions. This gap is more critical for bipedal walking, which is harder to control than other forms of legged locomotion, such as quadruped walking. This paper proposes a novel framework for active online learning of bipedal walking directly on a physical robot. To make the learning procedure both fast to converge and efficient, the framework uses a polynomial response surrogate model, an active learning strategy based on orthogonal experimental design, and a gradient ascent algorithm. Experimental results on a real humanoid robot, PKU-HR3, demonstrate its effectiveness, indicating that the proposed learning framework is a promising alternative for bipedal walking pattern learning.
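The abstract names three ingredients: a polynomial surrogate, an orthogonal experimental design, and gradient ascent. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the two walking parameters, the quadratic form of the surrogate, the 3x3 full factorial design standing in for an orthogonal design, and the hypothetical `toy_score` function standing in for a score measured on a robot. The active selection of new trials is not shown.

```python
import numpy as np

def quadratic_features(x):
    """Map points of shape (n, d) to features [1, x_i, x_i * x_j for i <= j]."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n, d = x.shape
    cols = [np.ones(n)]
    cols += [x[:, i] for i in range(d)]
    cols += [x[:, i] * x[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

def fit_surrogate(x, y):
    """Least-squares fit of the quadratic surrogate coefficients."""
    coef, *_ = np.linalg.lstsq(quadratic_features(x), y, rcond=None)
    return coef

def predict(coef, x):
    """Evaluate the fitted surrogate at one or more points."""
    return quadratic_features(x) @ coef

def gradient(coef, x, eps=1e-5):
    """Central-difference gradient of the surrogate at a single point x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        g[i] = (predict(coef, x + step)[0] - predict(coef, x - step)[0]) / (2 * eps)
    return g

def gradient_ascent(coef, x0, lr=0.1, steps=200):
    """Climb the surrogate from x0 with a fixed step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x + lr * gradient(coef, x)
    return x

def toy_score(x):
    """Hypothetical stand-in for a walking score measured on the robot."""
    return -(x[:, 0] - 0.3) ** 2 - (x[:, 1] + 0.2) ** 2

# Assumed design: two factors at three levels each (a full factorial,
# which is one simple orthogonal design).
levels = np.array([-1.0, 0.0, 1.0])
design = np.array([[a, b] for a in levels for b in levels])

coef = fit_surrogate(design, toy_score(design))
best = gradient_ascent(coef, x0=[0.0, 0.0])
print(best)
```

In an online setting, each row of `design` would correspond to one walking trial on the robot, so the appeal of a small orthogonal design is that it keeps the number of physical trials low while still identifying every surrogate coefficient.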

Yi Wang | Xihong Wu | Dingsheng Luo

[1] Kevin Tucker, et al. Response surface approximation of Pareto optimal front in multi-objective optimization, 2004.

[2] Dirk Thomas, et al. Versatile, High-Quality Motions and Behavior Control of a Humanoid Soccer Robot, 2008, Int. J. Humanoid Robotics.

[3] Claude Sammut, et al. Omnidirectional Locomotion for Quadruped Robots, 2001, RoboCup.

[4] Stephan K. Chalup, et al. Techniques for Improving Vision and Locomotion on the Sony AIBO Robot, 2003.

[5] Cord Niehaus, et al. Gait Optimization on a Humanoid Robot using Particle Swarm Optimization, 2007.

[6] Oskar von Stryk, et al. International Journal of Robotics Research, 2022.

[7] Peter Stone, et al. Machine Learning for Fast Quadrupedal Locomotion, 2004, AAAI.

[8] Helio J. C. Barbosa, et al. A similarity-based surrogate model for expensive evolutionary optimization with fixed budget of simulations, 2009, IEEE Congress on Evolutionary Computation.

[9] Peter Stone, et al. Policy gradient reinforcement learning for fast quadrupedal locomotion, 2004, IEEE International Conference on Robotics and Automation (ICRA '04).

[10] Sven Behnke, et al. Stochastic optimization of bipedal walking using gyro feedback and phase resetting, 2007, 7th IEEE-RAS International Conference on Humanoid Robots.

[11] Jerry E. Pratt, et al. Virtual model control of a bipedal walking robot, 1997, Proceedings of International Conference on Robotics and Automation.