Marlos C. Machado | Erik Talvitie | Michael H. Bowling | Yitao Liang
[1] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[2] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[3] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[4] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[5] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[6] Piotr Indyk, et al. Similarity Search in High Dimensions via Hashing, 1999, VLDB.
[7] Geoffrey J. Gordon. Reinforcement Learning with Function Approximation Converges to a Region, 2000, NIPS.
[8] Murray Campbell, et al. Deep Blue, 2002, Artif. Intell.
[9] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[10] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[12] Jonathan Schaeffer, et al. Checkers Is Solved, 2007, Science.
[13] Sean P. Meyn, et al. An analysis of reinforcement learning with function approximation, 2008, ICML '08.
[14] Jennifer Chu-Carroll, et al. Building Watson: An Overview of the DeepQA Project, 2010, AI Mag.
[15] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[16] Marc G. Bellemare, et al. Investigating Contingency Awareness Using Atari 2600 Games, 2012, AAAI.
[17] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[18] Marc G. Bellemare, et al. Sketch-Based Linear Value Function Approximation, 2012, NIPS.
[19] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[20] Thore Graepel, et al. A Comparison of learning algorithms on the Arcade Learning Environment, 2014, ArXiv.
[21] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[22] Neil Burch, et al. Heads-up limit hold’em poker is solved, 2015, Science.
[23] Marc G. Bellemare, et al. Compress and Control, 2015, AAAI.
[24] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[25] Marlos C. Machado, et al. Domain-Independent Optimistic Initialization for Reinforcement Learning, 2014, AAAI Workshop: Learning for General Competency in Video Games.
[26] Hector Geffner, et al. Classical Planning with Simulators: Results on the Atari Video Games, 2015, IJCAI.
[27] N. Sprague. Parameter Selection for the Deep Q-Learning Algorithm, 2015.
[28] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[29] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[30] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[31] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[32] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.