Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems

Autoregressive exogenous (ARX) systems are the general class of input-output dynamical systems used for modeling stochastic linear dynamical systems (LDS), including partially observable LDS such as LQG systems. In this work, we study the problem of system identification and adaptive control of unknown ARX systems. We provide finite-time learning guarantees for ARX systems under both open-loop and closed-loop data collection. Using these guarantees, we design adaptive control algorithms for unknown ARX systems with arbitrary strongly convex or non-strongly convex quadratic regulating costs. Under strongly convex cost functions, we design an adaptive control algorithm that uses online gradient descent to design and update controllers constructed via a convex controller reparametrization. We show that this algorithm attains Õ(√T) regret with an explore-and-commit approach, and that if the model estimates are updated in epochs using closed-loop data collection, it attains the optimal regret of polylog(T) after T time-steps of interaction. For non-strongly convex quadratic cost functions, we propose an adaptive control algorithm that deploys the optimism in the face of uncertainty principle to design the controller. In this setting, we show that the explore-and-commit approach has a regret upper bound of Õ(T^{2/3}), and that adaptive control with continuous model estimate updates attains Õ(√T) regret after T time-steps.
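
To make the open-loop identification setting concrete, the following is a minimal sketch, not the paper's estimator or analysis. It assumes a scalar ARX model y_t = a_1 y_{t-1} + ... + a_p y_{t-p} + b_1 u_{t-1} + ... + b_q u_{t-q} + w_t, i.i.d. Gaussian exploratory inputs, known orders p and q, and ordinary least squares on a single trajectory. The function names `simulate_arx` and `fit_arx`, the noise level, and the example coefficients are all illustrative assumptions.

```python
import numpy as np

def simulate_arx(a, b, T, noise_std=0.1, seed=0):
    """Simulate a scalar ARX system driven by i.i.d. Gaussian inputs (open loop).

    Assumed model (illustrative): y_t = sum_i a[i]*y_{t-1-i} + sum_j b[j]*u_{t-1-j} + w_t.
    """
    rng = np.random.default_rng(seed)
    p, q = len(a), len(b)
    m = max(p, q)
    u = rng.normal(size=T)              # exploratory inputs
    w = noise_std * rng.normal(size=T)  # process noise
    y = np.zeros(T)
    for t in range(m, T):
        past_y = y[t - p:t][::-1]       # y_{t-1}, ..., y_{t-p}
        past_u = u[t - q:t][::-1]       # u_{t-1}, ..., u_{t-q}
        y[t] = a @ past_y + b @ past_u + w[t]
    return u, y

def fit_arx(u, y, p, q):
    """Estimate ARX coefficients by ordinary least squares on one trajectory."""
    m = max(p, q)
    # Each regressor row stacks the p most recent outputs and q most recent inputs.
    Phi = np.array([
        np.concatenate([y[t - p:t][::-1], u[t - q:t][::-1]])
        for t in range(m, len(y))
    ])
    theta, *_ = np.linalg.lstsq(Phi, y[m:], rcond=None)
    return theta[:p], theta[p:]

# Hypothetical stable system used only for illustration.
a_true = np.array([0.5, -0.2])
b_true = np.array([1.0])
u, y = simulate_arx(a_true, b_true, T=5000)
a_hat, b_hat = fit_arx(u, y, p=2, q=1)
print("a_hat:", a_hat, "b_hat:", b_hat)
```

In this sketch the estimation error shrinks as the trajectory length T grows, which is the kind of behavior a finite-time guarantee quantifies. Closed-loop data collection would replace the i.i.d. inputs with controller-generated ones, which this example does not cover.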
