Robot trajectory optimization using approximate inference

The general stochastic optimal control (SOC) problem in robotics scenarios is often too complex to be solved exactly and in near real time. A classical approximate solution is to first compute an optimal (deterministic) trajectory and then solve a local linear-quadratic-Gaussian (LQG) perturbation model around it to handle the system stochasticity. We present a new algorithm for this approach that improves upon previous algorithms such as iLQG. We consider a probabilistic model for which the maximum likelihood (ML) trajectory coincides with the optimal trajectory and which, in the LQG case, reproduces the classical SOC solution. The algorithm then uses approximate inference methods (similar to expectation propagation) that generalize efficiently to non-LQG systems. We demonstrate the algorithm on a simulated 39-DoF humanoid robot.
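
To make the LQG building block mentioned above concrete, here is a minimal sketch of the classical finite-horizon backward Riccati recursion for a linear system with quadratic costs. It is not the algorithm proposed in the paper, only the textbook LQ solution that the abstract says is recovered in the LQG case. The function names (`lqr_backward`, `rollout`), the double-integrator dynamics, and all cost weights are assumptions chosen for illustration.

```python
import numpy as np

def lqr_backward(A, B, Q, R, Q_final, horizon):
    """Finite-horizon discrete-time LQR via backward Riccati recursion.

    Returns time-ordered gains K_t (control law u_t = -K_t x_t) and the
    cost-to-go Hessian at the first time step.
    """
    V = Q_final
    gains = []
    for _ in range(horizon):
        # K = (R + B^T V B)^{-1} B^T V A
        K = np.linalg.solve(R + B.T @ V @ B, B.T @ V @ A)
        # Riccati update: V <- Q + A^T V (A - B K)
        V = Q + A.T @ V @ (A - B @ K)
        gains.append(K)
    gains.reverse()  # computed backward in time, so reorder to t = 0..T-1
    return gains, V

def rollout(A, B, gains, x0, noise_std=0.0, seed=0):
    """Simulate the closed loop with optional additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    xs = [x.copy()]
    for K in gains:
        u = -K @ x
        x = A @ x + B @ u + noise_std * rng.standard_normal(x.shape)
        xs.append(x.copy())
    return np.array(xs)

# Hypothetical example system: a double integrator (position, velocity).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.diag([1.0, 0.1])   # state cost (assumed weights)
R = np.array([[0.01]])    # control cost (assumed weight)

gains, V0 = lqr_backward(A, B, Q, R, Q_final=10 * Q, horizon=50)
xs = rollout(A, B, gains, x0=np.array([1.0, 0.0]))
```

For linear dynamics with additive Gaussian noise, the optimal feedback gains are the same as in the deterministic problem, so calling `rollout` with `noise_std > 0` reuses the same `gains`. Methods such as iLQG apply a recursion of this kind to a local linear-quadratic approximation around the current trajectory.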
