Incremental Basis Construction from Temporal Difference Error
[1] Lihong Li, et al. Analyzing feature generation for value-function approximation, 2007, ICML '07.
[2] Bo Liu, et al. Basis Construction from Power Series Expansions of Value Functions, 2010, NIPS.
[3] Andrew G. Barto, et al. Linear Least-Squares Algorithms for Temporal Difference Learning, 2005, Machine Learning.
[4] Robert Givan, et al. Feature-Discovering Approximate Value Iteration Methods, 2005, SARA.
[5] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[6] Shalabh Bhatnagar, et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation, 2009, ICML '09.
[7] Andrew Y. Ng, et al. Regularization and feature selection in least-squares temporal difference learning, 2009, ICML '09.
[8] Alborz Geramifard, et al. Incremental Least-Squares Temporal Difference Learning, 2006, AAAI.
[9] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.
[10] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[11] Shie Mannor, et al. Automatic basis function construction for approximate dynamic programming and reinforcement learning, 2006, ICML.
[12] R. Sutton, et al. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces, 2010.
[13] Richard S. Sutton, et al. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces, 2010, Artificial General Intelligence.
[14] Sridhar Mahadevan, et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes, 2007, J. Mach. Learn. Res.
[15] Alessandro Lazaric, et al. LSTD with Random Projections, 2010, NIPS.
[16] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[17] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[18] Richard S. Sutton, et al. A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation, 2008, NIPS.
[19] Justin A. Boyan, et al. Technical Update: Least-Squares Temporal Difference Learning, 2002, Machine Learning.
[20] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[21] Byron Boots, et al. Predictive State Temporal Difference Learning, 2010, NIPS.