A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
David Silver | Thomas Hubert | Julian Schrittwieser | Ioannis Antonoglou | Matthew Lai | Arthur Guez | Marc Lanctot | Laurent Sifre | Dharshan Kumaran | Thore Graepel | Timothy Lillicrap | Karen Simonyan | Demis Hassabis
[2] Claude E. Shannon. Programming a Computer for Playing Chess, 1950.
[3] Emmanuel Lasker. Common Sense in Chess, 1965.
[4] A. L. Samuel. Some Studies in Machine Learning Using the Game of Checkers. II: Recent Progress, 1967.
[5] Donald E. Knuth, et al. An Analysis of Alpha-Beta Pruning, 1975, Artif. Intell.
[6] Stuart C. Shapiro, et al. Encyclopedia of Artificial Intelligence, vols. 1 and 2 (2nd ed.), 1992.
[7] L. V. Allis. Searching for Solutions in Games and Artificial Intelligence, 1994.
[8] Gerald Tesauro. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.
[9] Sebastian Thrun. Is Learning the n-th Thing Any Easier Than Learning the First?, 1995, NIPS.
[10] B. Pell. A Strategic Metagame Player for General Chess-Like Games, 1994, Comput. Intell.
[11] Gerald Tesauro, et al. On-line Policy Improvement Using Monte-Carlo Search, 1996, NIPS.
[12] Donald F. Beal, et al. Temporal Difference Learning for Heuristic Search and Game Playing, 2000, Inf. Sci.
[13] Donald F. Beal, et al. Temporal Difference Learning Applied to Game Playing and the Results of Application to Shogi, 2001, Theor. Comput. Sci.
[14] Hiroyuki Iida, et al. Computer Shogi, 2002, Artif. Intell.
[15] Murray Campbell, et al. Deep Blue, 2002, Artif. Intell.
[16] Gerald Tesauro. Programming Backgammon Using Self-Teaching Neural Nets, 2002, Artif. Intell.
[17] Feng-Hsiung Hsu. Behind Deep Blue: Building the Computer that Defeated the World Chess Champion, 2002.
[18] Wei-Yin Loh, et al. A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms, 2000, Machine Learning.
[19] Andrew Tridgell, et al. Learning to Play Chess Using Temporal Differences, 2000, Machine Learning.
[20] Michael R. Genesereth, et al. General Game Playing: Overview of the AAAI Competition, 2005, AI Mag.
[21] Rémi Coulom. Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength, 2008, Computers and Games.
[22] Joel Veness, et al. Bootstrapping from Game Tree Search, 2009, NIPS.
[23] Christopher D. Rosin. Multi-Armed Bandits with Episode Context, 2011, Annals of Mathematics and Artificial Intelligence.
[24] Tomoyuki Kaneko, et al. Large-Scale Optimization for Evaluation Functions with Minimax Search, 2014, J. Artif. Intell. Res.
[25] David Silver, et al. Move Evaluation in Go Using Deep Convolutional Neural Networks, 2014, ICLR.
[26] Matthew Lai. Giraffe: Using Deep Reinforcement Learning to Play Chess, 2015, arXiv.
[27] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[28] Nathan S. Netanyahu, et al. DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess, 2016, ICANN.
[29] Demis Hassabis, et al. Mastering the Game of Go with Deep Neural Networks and Tree Search, 2016, Nature.
[30] David Barber, et al. Thinking Fast and Slow with Deep Learning and Tree Search, 2017, NIPS.
[31] Demis Hassabis, et al. Mastering the Game of Go without Human Knowledge, 2017, Nature.