Tom Schaul | David Silver | Ioannis Antonoglou | John Quan
[1] Jürgen Schmidhuber. Curious model-building control systems, 1991, Proceedings of the 1991 IEEE International Joint Conference on Neural Networks.
[2] J. Diamond. Zebras and the Anna Karenina Principle, 1994.
[3] Martin A. Riedmiller, et al. Rprop - Description and Implementation Details, 1994.
[4] David Andre, et al. Generalized Prioritized Sweeping, 1997, NIPS.
[5] Richard K. Belew, et al. New Methods for Competitive Coevolution, 1997, Evolutionary Computation.
[6] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[7] Andrew W. Moore, et al. Prioritized sweeping: Reinforcement learning with less data and less time, 1993, Machine Learning.
[8] Long Ji Lin. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[9] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.
[10] David J. Foster, et al. Reverse replay of behavioural sequences in hippocampal place cells during the awake state, 2006, Nature.
[11] Geoffrey E. Hinton. To recognize shapes, first learn to generate images, 2007, Progress in Brain Research.
[12] David A. McAllester, et al. A discriminatively trained, multiscale, deformable part model, 2008, IEEE Conference on Computer Vision and Pattern Recognition.
[13] L. Frank, et al. Rewarded Outcomes Enhance Reactivation of Experience in the Hippocampus, 2009, Neuron.
[14] Hado van Hasselt. Double Q-learning, 2010, NIPS.
[15] Clément Farabet, et al. Torch7: A Matlab-like Environment for Machine Learning, 2011, NIPS.
[16] Yi Sun, et al. Incremental Basis Construction from Temporal Difference Error, 2011, ICML.
[17] Alborz Geramifard, et al. Online Discovery of Feature Dependencies, 2011, ICML.
[18] Francisco Herrera, et al. A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[19] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, arXiv.
[20] Richard S. Sutton, et al. Planning by Prioritized Sweeping with Small Backups, 2013, ICML.
[21] Tom Schaul, et al. No more pesky learning rates, 2012, ICML.
[22] R. Sutton, et al. Surprise and Curiosity for Big Data Robotics, 2014.
[23] Richard S. Sutton, et al. Weighted importance sampling for off-policy learning with linear function approximation, 2014, NIPS.
[24] D. Dupret, et al. Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence, 2014, Nature Neuroscience.
[25] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[26] D. Hassabis, et al. Hippocampal place cells construct reward related sequences through unexplored space, 2015, eLife.
[27] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, arXiv.
[28] Laura A. Atherton, et al. Memory trace replay: the shaping of memory consolidation by neuromodulation, 2015, Trends in Neurosciences.
[29] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[30] Shane Legg, et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015, arXiv.
[31] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[32] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[33] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[34] Marc G. Bellemare, et al. Increasing the Action Gap: New Operators for Reinforcement Learning, 2015, AAAI.