MO2: Model-Based Offline Options
Martin A. Riedmiller | R. Hadsell | N. Heess | Dhruva Tirumala | Markus Wulfmeier | Dushyant Rao | Sasha Salter