Dushyant Rao | Abbas Abdolmaleki | Nicolas Heess | Martin Riedmiller | Markus Wulfmeier | Roland Hafner | Michael Neunert | Tim Hertweck | Dhruva Tirumala | Noah Siegel | Thomas Lampe
[1] Doina Precup,et al. Temporal abstraction in reinforcement learning , 2000, ICML.
[2] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[3] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[4] Sergey Levine,et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning , 2018, ICLR.
[5] Gerald Tesauro,et al. Learning Abstract Options , 2018, NeurIPS.
[6] Alejandro Agostini,et al. Reinforcement Learning with a Gaussian mixture model , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[7] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res.
[8] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[9] Doina Precup,et al. Off-policy Learning with Options and Recognizers , 2005, NIPS.
[10] Sergey Levine,et al. Data-Efficient Hierarchical Reinforcement Learning , 2018, NeurIPS.
[11] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[12] Yuval Tassa,et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.
[13] Martin A. Riedmiller,et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch , 2018, ICML.
[14] Doina Precup,et al. The Termination Critic , 2019, AISTATS.
[15] Joelle Pineau,et al. An Inference-Based Policy Gradient Method for Learning Options , 2018, ICML.
[16] Sergey Levine,et al. Latent Space Policies for Hierarchical Reinforcement Learning , 2018, ICML.
[17] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[18] Martin A. Riedmiller,et al. Regularized Hierarchical Policies for Compositional Transfer in Robotics , 2019, ArXiv.
[19] Kate Saenko,et al. Learning Multi-Level Hierarchies with Hindsight , 2017, ICLR.
[20] Shimon Whiteson,et al. DAC: The Double Actor-Critic Architecture for Learning Options , 2019, NeurIPS.
[21] Doina Precup,et al. When Waiting is not an Option: Learning Options with a Deliberation Cost , 2017, AAAI.
[22] Yee Whye Teh,et al. Information asymmetry in KL-regularized RL , 2019, ICLR.
[23] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[24] C. Bishop. Mixture density networks , 1994.
[25] Pieter Abbeel,et al. Sub-policy Adaptation for Hierarchical Reinforcement Learning , 2019, ICLR.
[26] Yee Whye Teh,et al. Exploiting Hierarchy for Learning and Transfer in KL-regularized RL , 2019, ArXiv.
[27] Misha Denil,et al. The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously , 2017, CoRL.
[28] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[29] Ion Stoica,et al. Multi-Level Discovery of Deep Options , 2017, ArXiv.
[30] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[31] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[32] Pieter Abbeel,et al. Meta Learning Shared Hierarchies , 2017, ICLR.
[33] Yuval Tassa,et al. Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.
[34] Marcin Andrychowicz,et al. Asymmetric Actor Critic for Image-Based Robot Learning , 2017, Robotics: Science and Systems.
[35] Martin A. Riedmiller,et al. Compositional Transfer in Hierarchical Reinforcement Learning , 2019, Robotics: Science and Systems.
[36] Shimon Whiteson,et al. Multitask Soft Option Learning , 2019, UAI.
[37] Ion Stoica,et al. DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations , 2017, CoRL.
[38] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[39] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[40] Shimon Whiteson,et al. TACO: Learning Task Decomposition via Temporal Alignment for Control , 2018, ICML.
[41] Yuval Tassa,et al. Relative Entropy Regularized Policy Iteration , 2018, ArXiv.
[42] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[43] Thomas G. Dietterich,et al. To transfer or not to transfer , 2005, NIPS.
[44] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[45] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[46] Jan Peters,et al. Hierarchical Relative Entropy Policy Search , 2014, AISTATS.