Distributionally Robust Trajectory Optimization Under Uncertain Dynamics via Relative-Entropy Trust Regions

Trajectory optimization and model predictive control are essential techniques underpinning advanced robotic applications, ranging from autonomous driving to full-body humanoid control. State-of-the-art algorithms have focused on data-driven approaches that infer the system dynamics online and incorporate posterior uncertainty during planning and control. Despite their success, such approaches remain susceptible to catastrophic errors arising from statistical learning biases, unmodeled disturbances, or even directed adversarial attacks. In this paper, we tackle the problem of dynamics mismatch and propose a distributionally robust optimal control formulation that alternates between two relative-entropy trust-region optimization problems. Our method finds the worst-case maximum-entropy Gaussian posterior over the dynamics parameters and the corresponding robust optimal policy. We show that our approach admits a closed-form backward pass for a certain class of systems and demonstrate the resulting robustness on linear and nonlinear numerical examples.
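To make the idea of a relative-entropy trust region concrete, the following is a minimal sketch of a sample-based adversarial step, not the method of the paper. It uses the standard result that the distribution maximizing an expected cost inside a KL ball around a nominal distribution is an exponential tilting of the nominal one. The paper works with Gaussian posteriors over dynamics parameters and alternates this kind of step with a policy update; here the nominal distribution is assumed to be uniform over a finite set of sampled costs, and the function name `worst_case_weights`, the bisection bounds, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np


def worst_case_weights(costs, epsilon, iters=100):
    """Weights w maximizing sum(w * costs) subject to KL(w || uniform) <= epsilon.

    The maximizer has the form w_i proportional to exp(costs_i / eta); the
    temperature eta > 0 is found by bisection so that the KL budget is met.
    """
    costs = np.asarray(costs, dtype=float)
    n = costs.size

    def tilt(eta):
        # Subtract the maximum before exponentiating for numerical stability.
        w = np.exp((costs - costs.max()) / eta)
        w /= w.sum()
        # KL divergence from the uniform nominal distribution (1/n each).
        kl = np.sum(w * np.log(np.maximum(w, 1e-300) * n))
        return w, kl

    # KL decreases monotonically as eta grows, so bisect on a log scale.
    lo, hi = 1e-6, 1e6  # assumed search bounds for the temperature
    for _ in range(iters):
        eta = np.sqrt(lo * hi)
        _, kl = tilt(eta)
        if kl > epsilon:
            lo = eta  # tilting too aggressive: raise the temperature
        else:
            hi = eta  # within budget: try a more adversarial tilt
    return tilt(hi)[0]


costs = np.array([0.0, 1.0, 2.0, 3.0])
w = worst_case_weights(costs, epsilon=0.1)
nominal_cost = costs.mean()
robust_cost = w @ costs  # expected cost under the adversarial reweighting
```

A robust planner would then minimize `robust_cost` rather than `nominal_cost`; shrinking `epsilon` toward zero recovers the nominal expectation, while a larger budget concentrates weight on the highest-cost samples.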
