Advancing Trajectory Optimization with Approximate Inference: Exploration, Covariance Control and Adaptive Risk

Discrete-time stochastic optimal control remains a challenging problem for general, nonlinear systems under significant uncertainty, with practical solvers typically relying on the certainty equivalence assumption, replanning, and/or extensive regularization. Control-as-inference is an approach that frames stochastic control as an equivalent inference problem, and has demonstrated advantages over existing methods, notably in exploration and regularization. We look specifically at the input inference for control (I2C) algorithm, and derive three key characteristics that enable advanced trajectory optimization: an ‘expert’ linear Gaussian controller that combines the benefits of open-loop optima and closed-loop variance reduction when optimizing for nonlinear systems, adaptive risk sensitivity for regularized exploration, and covariance control through specification of the terminal state distribution.
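The control-as-inference framing can be illustrated on a linear Gaussian toy problem: place a Gaussian prior on the control inputs (playing the role of an effort penalty), treat the goal as a noisy pseudo-observation of the terminal state, and read off the posterior mean over inputs by exact Gaussian conditioning. The sketch below is a minimal illustration of this idea for a 1D double integrator, not an implementation of the I2C algorithm itself; all parameter values (`lam_prior`, `sig_obs`, horizon) are illustrative assumptions.

```python
import numpy as np

# 1D double integrator: state x = [position, velocity]
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
T = 20
x0 = np.zeros(2)
goal = np.array([1.0, 0.0])  # reach position 1 and stop

# Terminal state is linear in the inputs: x_T = A^T x0 + G u,
# where column t of G is A^{T-1-t} B.
G = np.hstack([np.linalg.matrix_power(A, T - 1 - t) @ B for t in range(T)])
xT_free = np.linalg.matrix_power(A, T) @ x0

lam_prior = 1e-2  # precision of the zero-mean Gaussian prior on u (effort penalty)
sig_obs = 1e-3    # variance of the pseudo-observation "x_T = goal"

# Posterior mean over inputs = regularized least-squares solution
# (Gaussian prior x Gaussian likelihood => Gaussian posterior).
S = G.T @ G / sig_obs + lam_prior * np.eye(T)
u_mean = np.linalg.solve(S, G.T @ (goal - xT_free) / sig_obs)

# Roll the inferred open-loop inputs through the dynamics.
x = x0.copy()
for t in range(T):
    x = A @ x + B @ u_mean[t:t + 1]
```

With a tight pseudo-observation (small `sig_obs`), the posterior mean drives the terminal state close to the goal while the prior keeps the inputs small, mirroring the cost/regularization trade-off that the abstract attributes to the inference view of stochastic control.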
