Generalization in Reinforcement Learning: Safely Approximating the Value Function
[1] R. Bellman, et al. Polynomial approximation—a new computational technique in dynamic programming: Allocation processes, 1962.
[2] R. Lathe. PhD by thesis, 1988, Nature.
[3] W. Cleveland, et al. Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting, 1988.
[4] A. Barto, et al. Learning and Sequential Decision Making, 1989.
[5] Richard Yee, et al. Abstraction in Control Learning, 1992.
[6] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[7] Sridhar Mahadevan, et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning, 1991, Artif. Intell.
[8] Steven J. Bradtke, et al. Reinforcement Learning Applied to Linear Quadratic Regulation, 1992, NIPS.
[9] Ronald J. Williams, et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions, 1993.
[10] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.