Bayesian optimization for learning gaits under uncertainty

Designing gaits and corresponding control policies is a key challenge in robot locomotion. Even with a viable controller parameterization, finding near-optimal parameters can be daunting. Typically, this kind of parameter optimization requires specific expert knowledge and extensive robot experiments. Automatic black-box gait optimization methods greatly reduce the need for human expertise and time-consuming design processes. Many different approaches for automatic gait optimization have been suggested to date, such as grid search and evolutionary algorithms. In this article, we discuss several of these optimization methods in the context of automatic gait optimization. Moreover, we extensively evaluate Bayesian optimization, a model-based approach to black-box optimization under uncertainty, on both simulated problems and real robots. This evaluation demonstrates that Bayesian optimization is particularly suited for robotic applications, where it is crucial to find a good set of gait parameters in a small number of experiments.
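To make the idea of model-based black-box optimization under uncertainty concrete, the following is a minimal sketch of a Bayesian optimization loop. It is not the implementation evaluated in the article. It assumes a single gait parameter scaled to [0, 1], a Gaussian process surrogate with a squared-exponential kernel whose hyperparameters are fixed rather than learned, and expected improvement as the acquisition function (one common choice among several). The objective `toy_walking_speed` and the constants `LENGTH_SCALE` and `SIGNAL_VAR` are hypothetical stand-ins for a noisy robot experiment.

```python
import math
import numpy as np

# Assumed, fixed kernel hyperparameters (in practice these are usually learned).
LENGTH_SCALE = 0.15
SIGNAL_VAR = 1.0


def rbf_kernel(a, b):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return SIGNAL_VAR * np.exp(-0.5 * (diff / LENGTH_SCALE) ** 2)


def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.clip(SIGNAL_VAR - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)


_erf = np.vectorize(math.erf)


def expected_improvement(mean, std, best):
    """Expected improvement over `best` for a maximization problem."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def toy_walking_speed(x, rng):
    """Hypothetical noisy objective standing in for one robot experiment."""
    return math.exp(-((x - 0.7) ** 2) / 0.02) + 0.05 * rng.normal()


def bayesian_optimization(objective, n_init=3, n_iter=15, seed=0):
    """Run a few random trials, then choose each new trial by maximizing EI."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)  # grid search over the acquisition
    xs = list(rng.uniform(0.0, 1.0, size=n_init))
    ys = [objective(x, rng) for x in xs]
    for _ in range(n_iter):
        x_arr, y_arr = np.array(xs), np.array(ys)
        offset = y_arr.mean()  # center targets for the zero-mean GP
        mean, std = gp_posterior(x_arr, y_arr - offset, candidates)
        # Simplification: use the best noisy observation as the incumbent.
        ei = expected_improvement(mean, std, (y_arr - offset).max())
        x_next = float(candidates[np.argmax(ei)])
        xs.append(x_next)
        ys.append(objective(x_next, rng))
    return np.array(xs), np.array(ys)


if __name__ == "__main__":
    xs, ys = bayesian_optimization(toy_walking_speed)
    print("best observed parameter:", xs[np.argmax(ys)])
```

Each loop iteration corresponds to one experiment, so the surrogate model is what lets the method spend a small evaluation budget on the most promising parameters. On a real robot the grid over candidates would be replaced by a proper optimizer over a multi-dimensional parameter space.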
