Policy iterations for reinforcement learning problems in continuous time and space - Fundamental theory and methods
