Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Marc G. Bellemare | Rishabh Agarwal | P. S. Castro | Max Schwarzer | Aaron C. Courville
[1] Ross B. Girshick,et al. Mask R-CNN , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[2] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[3] S. Levine,et al. Accelerating Online Reinforcement Learning with Offline Datasets , 2020, ArXiv.
[4] Tom B. Brown,et al. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback , 2022, ArXiv.
[5] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[6] Mehmet Remzi Dogar,et al. Planning with a Receding Horizon for Manipulation in Clutter Using a Learned Value Function , 2018, 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids).
[7] Quoc V. Le,et al. A graph placement methodology for fast chip design , 2021, Nature.
[8] Mark Rowland,et al. Understanding and Preventing Capacity Loss in Reinforcement Learning , 2022, ICLR.
[9] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[10] Reinaldo A. C. Bianchi,et al. Heuristically Accelerated Q-Learning: A New Approach to Speed Up Reinforcement Learning , 2004, SBIA.
[11] Tianqi Chen,et al. Net2Net: Accelerating Learning via Knowledge Transfer , 2015, ICLR.
[12] Georg Ostrovski,et al. The Difficulty of Passive Learning in Deep Reinforcement Learning , 2021, NeurIPS.
[13] Sen Wang,et al. Learning with Training Wheels: Speeding up Training with a Simple Controller for Deep Reinforcement Learning , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[14] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[15] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.
[16] Krzysztof Choromanski,et al. MLGO: a Machine Learning Guided Compiler Optimizations Framework , 2021, ArXiv.
[17] Doina Precup,et al. Towards Continual Reinforcement Learning: A Review and Perspectives , 2020, ArXiv.
[18] S. Levine,et al. Conservative Q-Learning for Offline Reinforcement Learning , 2020, NeurIPS.
[19] Marcin Andrychowicz,et al. Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.
[20] Pieter Abbeel,et al. Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble , 2021, CoRL.
[21] Sergey Levine,et al. Offline Reinforcement Learning with Implicit Q-Learning , 2021, ICLR.
[22] Alexey Skrynnik,et al. Forgetful experience replay in hierarchical reinforcement learning from expert demonstrations , 2021, Knowl. Based Syst.
[23] Johan S. Obando-Ceron,et al. Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research , 2020, ICML.
[24] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[25] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res.
[26] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[27] Geoffrey E. Hinton,et al. Big Self-Supervised Models are Strong Semi-Supervised Learners , 2020, NeurIPS.
[28] Rishabh Agarwal,et al. An Optimistic Perspective on Offline Reinforcement Learning , 2019, ICML.
[29] Richard Tanburn,et al. Making Efficient Use of Demonstrations to Solve Hard Exploration Problems , 2019, ICLR.
[30] Charles Blundell,et al. Beyond Fine-Tuning: Transferring Behavior in Reinforcement Learning , 2021, arXiv:2102.13515.
[31] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res.
[32] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[33] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[34] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst.
[35] Yoshua Bengio,et al. Revisiting Fundamentals of Experience Replay , 2020, ICML.
[36] Marc G. Bellemare,et al. Deep Reinforcement Learning at the Edge of the Statistical Precipice , 2021, NeurIPS.
[37] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] S. Levine,et al. Jump-Start Reinforcement Learning , 2022, ICML.
[39] Martin A. Riedmiller,et al. Batch Reinforcement Learning , 2012, Reinforcement Learning.
[40] S. Levine,et al. Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning , 2020, CoRL.
[41] S. Levine,et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems , 2020, ArXiv.
[42] Sergey Levine,et al. AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale , 2021, CoRL.
[43] Sergey Levine,et al. Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning , 2020, ICLR.
[44] R. Fergus,et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels , 2020, ICLR.
[45] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .
[46] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[47] Sameera S. Ponda,et al. Autonomous navigation of stratospheric balloons using reinforcement learning , 2020, Nature.
[48] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[49] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[50] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[51] Andrey Kolobov,et al. Heuristic-Guided Reinforcement Learning , 2021, NeurIPS.
[52] Marc G. Bellemare,et al. Dopamine: A Research Framework for Deep Reinforcement Learning , 2018, ArXiv.
[53] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[54] Matthew E. Taylor,et al. Teaching on a budget: agents advising agents in reinforcement learning , 2013, AAMAS.
[55] A Survey of Generalisation in Deep Reinforcement Learning , 2021, ArXiv.
[56] Manuela M. Veloso,et al. Probabilistic policy reuse in a reinforcement learning agent , 2006, AAMAS '06.
[57] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[58] Stefan Schaal,et al. Learning from Demonstration , 1996, NIPS.
[59] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[60] A data-driven approach for learning to control computers , 2022, arXiv:2202.08137.
[61] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[62] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[63] Sergio Gomez Colmenarejo,et al. Acme: A Research Framework for Distributed Reinforcement Learning , 2020, ArXiv.
[64] Yang Gao,et al. Reinforcement Learning from Imperfect Demonstrations , 2018, ICLR.
[65] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[66] Leslie Pack Kaelbling,et al. Effective reinforcement learning for mobile robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[67] Byron Boots,et al. Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning , 2018, ICLR.
[68] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[69] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[70] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[71] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[72] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[73] Andrew Zisserman,et al. Kickstarting Deep Reinforcement Learning , 2018, ArXiv.
[74] John Langford,et al. Learning to Search Better than Your Teacher , 2015, ICML.
[75] Peter Stone,et al. Agents teaching agents: a survey on inter-agent transfer learning , 2019, Autonomous Agents and Multi-Agent Systems.
[76] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[77] Marc G. Bellemare,et al. Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.
[78] Razvan Pascanu,et al. Distilling Policy Distillation , 2019, AISTATS.