[1] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SGAR.
[2] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[3] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[4] Long Ji Lin, et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[5] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[6] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[7] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[8] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[9] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[10] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[11] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[12] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[13] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[14] Shimon Whiteson, et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning, 2017, ICML.
[15] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[16] Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.
[17] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[18] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[19] James Zou, et al. The Effects of Memory Replay in Reinforcement Learning, 2017, 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[20] Yuval Tassa, et al. DeepMind Control Suite, 2018, ArXiv.
[21] Vitaly Levdik, et al. Time Limits in Reinforcement Learning, 2017, ICML.