Temporal Difference Learning of Position Evaluation in the Game of Go
Terrence J. Sejnowski | Peter Dayan | Nicol N. Schraudolph
[1] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[2] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.
[4] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[5] Donald Geman, et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Lawrence D. Jackel, et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.
[7] Herbert D. Enderton. The Golem Go Program, 1991.
[8] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.
[9] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.