A Non-Parametric Approach to Dynamic Programming
[1] R. Bellman, et al. Bottleneck Problems and Dynamic Programming, 1953, Proceedings of the National Academy of Sciences of the United States of America.
[2] M. Rosenblatt. Remarks on Some Nonparametric Estimates of a Density Function, 1956.
[3] R. E. Kalman, et al. Contributions to the Theory of Optimal Control, 1960.
[4] E. Parzen. On Estimation of a Probability Density Function and Mode, 1962.
[5] E. Nadaraya. On Estimating Regression, 1964.
[6] G. S. Watson, et al. Smooth regression analysis, 1964.
[7] E. Davison, et al. The numerical solution of A'Q + QA = -C, 1968.
[8] David Elkind, et al. Learning: An Introduction, 1968.
[9] G. Wahba, et al. Some results on Tchebycheffian spline functions, 1971.
[10] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.
[11] C. D. Kemp, et al. Density Estimation for Statistics and Data Analysis, 1987.
[12] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[13] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[14] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[15] H. Bersini, et al. Three connectionist implementations of dynamic programming for optimal control: a preliminary comparative analysis, 1996, Proceedings of the International Workshop on Neural Networks for Identification, Control, Robotics and Signal/Image Processing.
[16] K. Atkinson. The Numerical Solution of Integral Equations of the Second Kind, 1997.
[17] Christopher G. Atkeson, et al. A comparison of direct and model-based reinforcement learning, 1997, Proceedings of the International Conference on Robotics and Automation.
[18] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[19] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[20] Ralf Schoknecht, et al. Optimality of Reinforcement Learning Algorithms with Linear Function Approximation, 2002, NIPS.
[21] Rémi Munos, et al. Error Bounds for Approximate Policy Iteration, 2003, ICML.
[22] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[23] Shie Mannor, et al. Reinforcement learning with Gaussian processes, 2005, ICML.
[24] Liming Xiang, et al. Kernel-Based Reinforcement Learning, 2006, ICIC.
[25] Rémi Munos, et al. Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation, 2005, J. Mach. Learn. Res.
[26] Xin Xu, et al. Kernel Least-Squares Temporal Difference Learning, 2006.
[27] Peter Stone, et al. Model-based function approximation in reinforcement learning, 2007, AAMAS '07.
[28] Warren B. Powell, et al. Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics), 2007.
[29] Gavin Taylor, et al. Kernelized value function approximation for reinforcement learning, 2009, ICML '09.
[30] Andrew Y. Ng, et al. Regularization and feature selection in least-squares temporal difference learning, 2009, ICML '09.
[31] Shalabh Bhatnagar, et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation, 2009, NIPS.
[32] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2008, NIPS.
[33] Dominik Wied, et al. Consistency of the kernel density estimator: a survey, 2012.
[34] Jan Peters, et al. A biomimetic approach to robot table tennis, 2010, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[35] Warren B. Powell. Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2007, Wiley Series in Probability and Statistics.