Compositional Transfer in Hierarchical Reinforcement Learning
Thomas Lampe | Abbas Abdolmaleki | Jost Tobias Springenberg | Nicolas Heess | Martin Riedmiller | Markus Wulfmeier | Roland Hafner | Michael Neunert | Tim Hertweck | Noah Siegel