Error Bounds for Approximate Value Iteration
[1] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[2] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[3] Alexander J. Smola, et al. Support Vector Method for Function Approximation, Regression Estimation and Signal Processing, 1996, NIPS.
[4] John Rust. Numerical dynamic programming in economics, 1996.
[5] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[6] S. Mallat, et al. Adaptive greedy approximations, 1997.
[7] R. DeVore, et al. Nonlinear approximation, 1998, Acta Numerica.
[8] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[9] S. Mallat. A wavelet tour of signal processing, 1998.
[10] Daphne Koller, et al. Policy Iteration for Factored MDPs, 2000, UAI.
[11] Carlos Guestrin, et al. Max-norm Projections for Factored MDPs, 2001, IJCAI.
[12] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[13] Rémi Munos, et al. Error Bounds for Approximate Policy Iteration, 2003, ICML.
[14] Benjamin Van Roy, et al. The Linear Programming Approach to Approximate Dynamic Programming, 2003, Oper. Res.
[15] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[16] Eric R. Ziegel, et al. The Elements of Statistical Learning, 2003, Technometrics.
[17] Andrew W. Moore, et al. Locally Weighted Learning, 1997, Artificial Intelligence Review.
[18] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.