Beyond Tabula Rasa: Reincarnating Reinforcement Learning
[1] Mark Rowland,et al. Understanding and Preventing Capacity Loss in Reinforcement Learning , 2022, ICLR.
[2] Tom B. Brown,et al. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback , 2022, ArXiv.
[3] S. Levine,et al. Jump-Start Reinforcement Learning , 2022, ICML.
[4] A data-driven approach for learning to control computers , 2022, 2202.08137.
[5] Sergey Levine,et al. Offline Reinforcement Learning with Implicit Q-Learning , 2021, ICLR.
[6] A Survey of Generalisation in Deep Reinforcement Learning , 2021, ArXiv.
[7] Sergey Levine,et al. AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale , 2021, CoRL.
[8] Georg Ostrovski,et al. The Difficulty of Passive Learning in Deep Reinforcement Learning , 2021, NeurIPS.
[9] Marc G. Bellemare,et al. Deep Reinforcement Learning at the Edge of the Statistical Precipice , 2021, NeurIPS.
[10] Pieter Abbeel,et al. Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble , 2021, CoRL.
[11] Andrey Kolobov,et al. Heuristic-Guided Reinforcement Learning , 2021, NeurIPS.
[12] Quoc V. Le,et al. A graph placement methodology for fast chip design , 2021, Nature.
[13] Alexey Skrynnik,et al. Forgetful experience replay in hierarchical reinforcement learning from expert demonstrations , 2021, Knowl. Based Syst.
[14] Charles Blundell,et al. Beyond Fine-Tuning: Transferring Behavior in Reinforcement Learning , 2021, 2102.13515.
[15] Krzysztof Choromanski,et al. MLGO: a Machine Learning Guided Compiler Optimizations Framework , 2021, ArXiv.
[16] Johan S. Obando-Ceron,et al. Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research , 2020, ICML.
[17] Sergey Levine,et al. Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning , 2020, ICLR.
[18] R. Fergus,et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels , 2020, ICLR.
[19] Doina Precup,et al. Towards Continual Reinforcement Learning: A Review and Perspectives , 2020, ArXiv.
[20] Sameera S. Ponda,et al. Autonomous navigation of stratospheric balloons using reinforcement learning , 2020, Nature.
[21] Yoshua Bengio,et al. Revisiting Fundamentals of Experience Replay , 2020, ICML.
[22] Geoffrey E. Hinton,et al. Big Self-Supervised Models are Strong Semi-Supervised Learners , 2020, NeurIPS.
[23] S. Levine,et al. Accelerating Online Reinforcement Learning with Offline Datasets , 2020, ArXiv.
[24] S. Levine,et al. Conservative Q-Learning for Offline Reinforcement Learning , 2020, NeurIPS.
[25] Sergio Gomez Colmenarejo,et al. Acme: A Research Framework for Distributed Reinforcement Learning , 2020, ArXiv.
[26] S. Levine,et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems , 2020, ArXiv.
[27] S. Levine,et al. Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning , 2020, CoRL.
[28] Richard Tanburn,et al. Making Efficient Use of Demonstrations to Solve Hard Exploration Problems , 2019, ICLR.
[29] Rishabh Agarwal,et al. An Optimistic Perspective on Offline Reinforcement Learning , 2019, ICML.
[30] Ross B. Girshick,et al. Mask R-CNN , 2017, 1703.06870.
[31] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[32] Peter Stone,et al. Agents teaching agents: a survey on inter-agent transfer learning , 2019, Autonomous Agents and Multi-Agent Systems.
[33] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[34] Marcin Andrychowicz,et al. Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.
[35] Razvan Pascanu,et al. Distilling Policy Distillation , 2019, AISTATS.
[36] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[37] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[38] Marc G. Bellemare,et al. Dopamine: A Research Framework for Deep Reinforcement Learning , 2018, ArXiv.
[39] Sen Wang,et al. Learning with Training Wheels: Speeding up Training with a Simple Controller for Deep Reinforcement Learning , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[40] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[41] Mehmet Remzi Dogar,et al. Planning with a Receding Horizon for Manipulation in Clutter Using a Learned Value Function , 2018, 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids).
[42] Andrew Zisserman,et al. Kickstarting Deep Reinforcement Learning , 2018, ArXiv.
[43] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[44] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[45] Byron Boots,et al. Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning , 2018, ICLR.
[46] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[47] Yang Gao,et al. Reinforcement Learning from Imperfect Demonstrations , 2018, ICLR.
[48] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.
[49] Marc G. Bellemare,et al. Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.
[50] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[51] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[52] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res.
[53] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[54] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[55] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[57] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[58] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[59] Tianqi Chen,et al. Net2Net: Accelerating Learning via Knowledge Transfer , 2015, ICLR.
[60] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[61] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[62] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[63] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[64] John Langford,et al. Learning to Search Better than Your Teacher , 2015, ICML.
[65] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[66] Matthew E. Taylor,et al. Teaching on a budget: agents advising agents in reinforcement learning , 2013, AAMAS.
[67] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[68] Martin A. Riedmiller,et al. Batch Reinforcement Learning , 2012, Reinforcement Learning.
[69] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[70] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res.
[71] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst.
[72] Manuela M. Veloso,et al. Probabilistic policy reuse in a reinforcement learning agent , 2006, AAMAS '06.
[73] Reinaldo A. C. Bianchi,et al. Heuristically Accelerated Q-Learning: A New Approach to Speed Up Reinforcement Learning , 2004, SBIA.
[74] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[75] Leslie Pack Kaelbling,et al. Effective reinforcement learning for mobile robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[76] Stefan Schaal,et al. Learning from Demonstration , 1996, NIPS.
[77] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994.
[78] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .