Will Dabney | André Barreto | Mark Rowland | Robert Dadashi | John Quan | Marc G. Bellemare | David Silver
[1] Matthew W. Hoffman, et al. Distributed Distributional Deterministic Policy Gradients, 2018, ICLR.
[2] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2017, AAAI.
[3] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[4] Marc G. Bellemare, et al. Statistics and Samples in Distributional Reinforcement Learning, 2019, ICML.
[5] Andrew Y. Ng, et al. Regularization and feature selection in least-squares temporal difference learning, 2009, ICML.
[6] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[7] Ronald A. Howard. Dynamic Programming and Markov Processes, 1960.
[8] Lawrence Carin, et al. Linear Feature Encoding for Reinforcement Learning, 2016, NIPS.
[9] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[10] Leemon C. Baird. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[11] Henning Sprekeler. On the Relation of Slow Feature Analysis and Laplacian Eigenmaps, 2011, Neural Computation.
[12] Karthik Sridharan, et al. Optimization, Learning, and Games with Predictable Sequences, 2013, NIPS.
[13] Yoshua Bengio, et al. Hyperbolic Discounting and Learning over Multiple Horizons, 2019, arXiv.
[14] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[15] Matteo Hessel, et al. Deep Reinforcement Learning and the Deadly Triad, 2018, arXiv.
[16] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[17] Marc G. Bellemare, et al. Distributional Reinforcement Learning with Quantile Regression, 2017, AAAI.
[18] Andrew McCallum. Reinforcement Learning with Selective Perception and Hidden State, 1996.
[19] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[20] Lihong Li, et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning, 2008, ICML.
[21] Sridhar Mahadevan. Proto-value functions: developmental reinforcement learning, 2005, ICML.
[22] Tom Schaul, et al. Universal Successor Features Approximators, 2018, ICLR.
[23] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[24] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[25] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[26] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[27] U. Rieder, et al. Markov Decision Processes, 2010.
[28] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996.
[29] Nicolas Le Roux, et al. The Value Function Polytope in Reinforcement Learning, 2019, ICML.
[30] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998.
[31] Rémi Munos. Error Bounds for Approximate Policy Iteration, 2003, ICML.
[32] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, J. Comput. Syst. Sci.
[33] Rémi Munos, et al. Implicit Quantile Networks for Distributional Reinforcement Learning, 2018, ICML.
[34] Yinyu Ye. The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate, 2011, Math. Oper. Res.
[35] George Konidaris, et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis, 2011, AAAI.
[36] Nicolas Le Roux, et al. A Geometric Perspective on Optimal Representations for Reinforcement Learning, 2019, NeurIPS.
[37] Peter Dayan. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.
[38] Peter Dayan, et al. Q-learning, 1992, Machine Learning.