Least-Squares Temporal Difference Learning
[1] F. A. Seiler, et al. Numerical Recipes in C: The Art of Scientific Computing, 1989.
[2] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[3] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[4] William H. Press, et al. Numerical recipes in C++: the art of scientific computing, 2nd Edition (C++ ed., print. is corrected to software version 2.10), 1994.
[5] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[6] John N. Tsitsiklis, et al. Analysis of Temporal-Difference Learning with Function Approximation, 1996, NIPS.
[7] Dimitri P. Bertsekas, et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems, 1996, NIPS.
[8] Christopher G. Atkeson, et al. A comparison of direct and model-based reinforcement learning, 1997, Proceedings of International Conference on Robotics and Automation.
[9] Andrew W. Moore, et al. Learning evaluation functions for global optimization, 1998.
[10] Andrew W. Moore, et al. Learning Evaluation Functions for Global Optimization and Boolean Satisfiability, 1998, AAAI/IAAI.