暂无分享,去创建一个
[1] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[2] Marc G. Bellemare,et al. Skip Context Tree Switching , 2014, ICML.
[3] Sergey Levine,et al. Divide-and-Conquer Reinforcement Learning , 2017, ICLR.
[4] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[5] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[6] Hugo Larochelle,et al. Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction , 2019, AAAI.
[7] Marc G. Bellemare,et al. Dopamine: A Research Framework for Deep Reinforcement Learning , 2018, ArXiv.
[8] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[9] Marlos C. Machado,et al. Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment , 2019, ArXiv.
[10] Stefan Wermter,et al. Continual Lifelong Learning with Neural Networks: A Review , 2019, Neural Networks.
[11] Conrad D. James,et al. Neurogenesis deep learning: Extending deep networks to accommodate new classes , 2016, 2017 International Joint Conference on Neural Networks (IJCNN).
[12] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[13] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[14] Honglak Lee,et al. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning , 2017, ICML.
[15] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[16] Mark B. Ring. Continual learning in reinforcement environments , 1995, GMD-Bericht.
[17] R Ratcliff,et al. Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. , 1990, Psychological review.
[18] Yee Whye Teh,et al. Continual Unsupervised Representation Learning , 2019, NeurIPS.
[19] Michael McCloskey,et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .
[20] R. French. Catastrophic forgetting in connectionist networks , 1999, Trends in Cognitive Sciences.
[21] Derek Hoiem,et al. Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[22] Razvan Pascanu,et al. Ray Interference: a Source of Plateaus in Deep Reinforcement Learning , 2019, ArXiv.
[23] Joel Veness,et al. The Forget-me-not Process , 2016, NIPS.
[24] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[25] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[26] U. Rieder,et al. Markov Decision Processes , 2010 .
[27] Tinne Tuytelaars,et al. Task-Free Continual Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[29] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[30] R. Bellman. A Markovian Decision Process , 1957 .
[31] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[32] Amos J. Storkey,et al. Exploration by Random Network Distillation , 2018, ICLR.
[33] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[34] Chrisantha Fernando,et al. PathNet: Evolution Channels Gradient Descent in Super Neural Networks , 2017, ArXiv.
[35] Qiang Yang,et al. Lifelong Machine Learning Systems: Beyond Learning Algorithms , 2013, AAAI Spring Symposium: Lifelong Machine Learning.
[36] Sung Ju Hwang,et al. Lifelong Learning with Dynamically Expandable Networks , 2017, ICLR.
[37] Daniel Guo,et al. Agent57: Outperforming the Atari Human Benchmark , 2020, ICML.
[38] Surya Ganguli,et al. Continual Learning Through Synaptic Intelligence , 2017, ICML.
[39] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[40] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[41] Kenneth O. Stanley,et al. Go-Explore: a New Approach for Hard-Exploration Problems , 2019, ArXiv.
[42] Anthony V. Robins,et al. Catastrophic Forgetting, Rehearsal and Pseudorehearsal , 1995, Connect. Sci..
[43] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[44] Marlos C. Machado,et al. Generalization and Regularization in DQN , 2018, ArXiv.
[45] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[46] Marc'Aurelio Ranzato,et al. Gradient Episodic Memory for Continual Learning , 2017, NIPS.
[47] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).