Policy iterations for reinforcement learning problems in continuous time and space - Fundamental theory and methods
