Optimistic planning with an adaptive number of action switches for near-optimal nonlinear control

Abstract: We consider infinite-horizon optimal control of nonlinear systems with discrete control actions, and focus on optimistic planning algorithms from artificial intelligence, which can handle general nonlinear systems with nonquadratic costs. With the main goal of reducing computation, we introduce two such algorithms that search only over constrained action sequences: a sequence may switch between different actions at most a limited number of times. We call the first method optimistic switch-limited planning (OSP), and show analytically that its fixed number of switches S leads to complexity that is polynomial in the search horizon, in contrast to the exponential complexity of the existing optimistic planning (OP) algorithm for deterministic systems, and to correspondingly faster convergence towards optimality. Since tuning S is difficult, we introduce an adaptive variant called OASP that automatically adjusts S so as to limit computation while ensuring that near-optimal solutions keep being explored. OSP and OASP are evaluated analytically in representative special cases and illustrated numerically in simulations of a rotational pendulum. To show that the algorithms also work in challenging applications, OSP is used to control the pendulum in real time, while OASP is applied to trajectory control of a simulated quadrotor.
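The polynomial-versus-exponential contrast can be illustrated by counting candidate action sequences. The following is a minimal sketch, not the OSP algorithm itself: it assumes M discrete actions and a horizon of d steps, and the function names are hypothetical. A sequence with exactly s switches is fixed by its first action (M choices), the positions of the switches among the d - 1 transitions, and the new action at each switch (M - 1 choices each), so the number of sequences with at most S switches is the sum over s = 0..S of M (M - 1)^s C(d - 1, s). For fixed S this grows polynomially in d, whereas the unconstrained count is M^d.

```python
from itertools import product
from math import comb


def count_switches(seq):
    """Number of positions where consecutive actions differ."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def num_switch_limited(M, d, S):
    """Closed-form count of length-d sequences over M actions with <= S switches."""
    return sum(M * (M - 1) ** s * comb(d - 1, s) for s in range(S + 1))


def brute_force(M, d, S):
    """Same count by explicit enumeration (only feasible for small M and d)."""
    return sum(
        1 for seq in product(range(M), repeat=d) if count_switches(seq) <= S
    )


if __name__ == "__main__":
    M, S = 2, 2  # illustrative values, not taken from the paper
    for d in (5, 10, 20, 40):
        print(d, num_switch_limited(M, d, S), M ** d)
```

For small cases the closed form can be checked against `brute_force`; once S reaches d - 1 the constraint is inactive and the count equals M^d.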
