Jost Tobias Springenberg, Martin Riedmiller, Markus Wulfmeier, Roland Hafner, Tim Hertweck, Nicolas Heess, Michael Bloesch, Noah Siegel
[1] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[2] Emilio Soria Olivas, et al. Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques, 2009.
[3] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[4] Shakir Mohamed, et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, 2015, NIPS.
[5] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[6] Joelle Pineau, et al. Independently Controllable Features, 2017.
[7] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[8] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[9] Yuval Tassa, et al. Emergence of Locomotion Behaviours in Rich Environments, 2017, ArXiv.
[10] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[11] Georg Martius, et al. Control What You Can: Intrinsically Motivated Task-Planning Agent, 2019, NeurIPS.
[12] Patrick M. Pilarski, et al. Horde: A Scalable Real-time Architecture for Learning Knowledge from Unsupervised Sensorimotor Interaction, 2011, AAMAS.
[13] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[14] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[15] Yuval Tassa, et al. MuJoCo: A Physics Engine for Model-based Control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[16] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[17] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[18] Razvan Pascanu, et al. Learning to Navigate in Complex Environments, 2016, ICLR.
[19] Jürgen Schmidhuber, et al. Learning Skills from Play: Artificial Curiosity on a Katana Robot Arm, 2012, International Joint Conference on Neural Networks (IJCNN).
[20] Martin A. Riedmiller, et al. Speeding-up Reinforcement Learning with Multi-step Actions, 2002, ICANN.
[21] Gavriel Salomon, et al. Transfer of Learning, 1992.
[22] Thomas Lampe, et al. Compositional Transfer in Hierarchical Reinforcement Learning, 2019, RSS.
[23] Murilo F. Martins, et al. Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup, 2019, Robotics: Science and Systems.
[24] Vladlen Koltun, et al. Learning to Act by Predicting the Future, 2016, ICLR.
[25] Chrystopher L. Nehaniv, et al. All Else Being Equal Be Empowered, 2005, ECAL.
[26] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.
[27] Qiang Yang, et al. A Survey on Transfer Learning, 2010, IEEE Transactions on Knowledge and Data Engineering.
[28] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[29] Jason Weston, et al. Curriculum Learning, 2009, ICML.
[30] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[31] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[32] Samuel Gershman, et al. Deep Successor Reinforcement Learning, 2016, ArXiv.
[33] Pierre-Yves Oudeyer, et al. Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning, 2017, J. Mach. Learn. Res.
[34] Jürgen Schmidhuber, et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem, 2011, Front. Psychol.
[35] Martin A. Riedmiller, et al. Reinforcement Learning on Explicitly Specified Time Scales, 2003, Neural Computing & Applications.
[36] Misha Denil, et al. The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously, 2017, CoRL.
[37] Jan Peters, et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[38] Martin A. Riedmiller, et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch, 2018, ICML.
[39] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[40] Rui Wang, et al. Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions, 2019, ArXiv.
[41] Demis Hassabis, et al. Mastering the Game of Go with Deep Neural Networks and Tree Search, 2016, Nature.
[42] Sergey Levine, et al. Dynamics-Aware Unsupervised Discovery of Skills, 2019, ICLR.
[43] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[44] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[45] Raia Hadsell, et al. Disentangled Cumulants Help Successor Representations Transfer to New Tasks, 2019, ArXiv.
[46] Richard L. Lewis, et al. Where Do Rewards Come From?, 2009.
[47] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[48] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.