Variable Resolution Discretization in Optimal Control
[1] Donald E. Knuth,et al. Sorting and Searching , 1973 .
[2] Jon Louis Bentley,et al. An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1976, TOMS.
[3] Hendrik Van Brussel,et al. A self-learning automaton with variable resolution for high precision assembly by industrial robots , 1982 .
[4] P. Lions,et al. Viscosity solutions of Hamilton-Jacobi equations , 1983 .
[5] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[6] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .
[7] John Moody,et al. Fast Learning in Networks of Locally-Tuned Processing Units , 1989, Neural Computation.
[8] G. Barles,et al. Convergence of approximation schemes for fully nonlinear second order equations , 1990, 29th IEEE Conference on Decision and Control.
[9] A. Moore. Variable Resolution Dynamic Programming , 1991, ML.
[10] Wolfgang Hackbusch,et al. Parallel algorithms for partial differential equations - Proceedings of the sixth GAMM-seminar - Kiel, January 19-21, 1990 , 1991 .
[12] Harald Niederreiter,et al. Random number generation and Quasi-Monte Carlo methods , 1992, CBMS-NSF regional conference series in applied mathematics.
[13] W. Fleming,et al. Controlled Markov processes and viscosity solutions , 1992 .
[14] D. Moore. Simplicial Mesh Generation with Applications , 1992 .
[15] P. Lions,et al. User’s guide to viscosity solutions of second order partial differential equations , 1992, math/9207212.
[16] C. Atkeson,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[17] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[19] Andrew W. Moore,et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces , 2004, Machine Learning.
[20] Michael A. Trick,et al. A Linear Programming Approach to Solving Stochastic Dynamic Programming , 1993 .
[21] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[22] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[23] J. Quadrat. Numerical methods for stochastic control problems in continuous time , 1994 .
[24] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[25] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[26] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[27] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[28] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[29] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[30] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.
[31] Scott Davies,et al. Multidimensional Triangulation and Interpolation for Reinforcement Learning , 1996, NIPS.
[32] John Rust. Numerical dynamic programming in economics , 1996 .
[33] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[34] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[35] L. Grüne. An adaptive grid scheme for the discrete Hamilton-Jacobi-Bellman equation , 1997 .
[36] Rémi Munos,et al. Reinforcement Learning for Continuous Stochastic Control Problems , 1997, NIPS.
[37] Gary Boone,et al. Minimum-time control of the Acrobot , 1997, Proceedings of International Conference on Robotics and Automation.
[38] Andrew W. Moore,et al. Barycentric Interpolators for Continuous Space and Time Reinforcement Learning , 1998, NIPS.
[39] P. Dupuis,et al. Rates of Convergence for Approximation Schemes in Optimal Control , 1998 .
[40] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[41] Andrew W. Moore,et al. Gradient descent approaches to neural-net-based solutions of the Hamilton-Jacobi-Bellman equation , 1999, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339).
[42] Geoffrey J. Gordon,et al. Approximate solutions to markov decision processes , 1999 .
[43] Andrew W. Moore,et al. Rates of Convergence for Variable Resolution Schemes in Optimal Control , 2000, ICML.
[44] T. Sadahiro,et al. Landing control of acrobat robot (SMB) satisfying various constraints , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).
[46] H. Bungartz,et al. Sparse grids , 2004, Acta Numerica.
[48] Paul Bourgine,et al. Exploration of Multi-State Environments: Local Measures and Back-Propagation of Uncertainty , 1999, Machine Learning.
[49] Rémi Munos,et al. A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions , 2000, Machine Learning.