Filipe Mutz | Paulo Rauber | Jürgen Schmidhuber
[1] Michael McCloskey,et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .
[2] Jürgen Schmidhuber,et al. Learning to generate sub-goals for action sequences , 1991 .
[3] Jürgen Schmidhuber,et al. Learning to Generate Artificial Fovea Trajectories for Target Detection , 1991, Int. J. Neural Syst..
[4] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[5] Pranab Kumar Sen,et al. Large Sample Methods in Statistics: An Introduction with Applications , 1993 .
[6] Jonas Karlsson,et al. Learning via task decomposition , 1993 .
[7] Juergen Schmidhuber,et al. On learning how to learn learning strategies , 1994 .
[8] Gerhard Weiß,et al. Hierarchical Chunking in Classifier Systems , 1994, AAAI.
[9] Roland Olsson,et al. Inductive Functional Programming Using Incremental Program Transformation , 1995, Artif. Intell..
[10] Doina Precup,et al. Multi-time Models for Temporally Abstract Planning , 1997, NIPS.
[11] Risto Miikkulainen,et al. Incremental Evolution of Complex General Behavior , 1997, Adapt. Behav..
[12] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[13] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[14] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[15] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[16] Doina Precup,et al. Eligibility Traces for Off-Policy Policy Evaluation , 2000, ICML.
[17] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[18] Leonid Peshkin,et al. Learning from Scarce Experience , 2002, ICML.
[19] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[20] Risto Miikkulainen,et al. Evolving Keepaway Soccer Players through Task Decomposition , 2003, GECCO.
[21] Sridhar Mahadevan,et al. Hierarchical Policy Gradient Algorithms , 2003, ICML.
[22] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res..
[23] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[24] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[25] Jürgen Schmidhuber,et al. Optimal Ordered Problem Solver , 2002, Machine Learning.
[26] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[27] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[28] Bram Bakker,et al. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization , 2003 .
[29] Christopher M. Bishop. Pattern Recognition and Machine Learning , 2006, Springer.
[30] Stefan Schaal,et al. Reinforcement learning of motor skills with policy gradients , 2008, Neural Networks.
[31] Pieter Abbeel,et al. On a Connection between Importance Sampling and the Likelihood Ratio Policy Gradient , 2010, NIPS.
[32] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[33] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[34] Tom Schaul,et al. The two-dimensional organization of behavior , 2011, 2011 IEEE International Conference on Development and Learning (ICDL).
[35] Jan Peters,et al. Reinforcement Learning to Adjust Parametrized Motor Primitives to New Situations , 2011 .
[36] Bruno Castro da Silva,et al. Learning Parameterized Skills , 2012, ICML.
[37] Jürgen Schmidhuber,et al. First Experiments with PowerPlay , 2012, Neural Networks.
[38] Jürgen Schmidhuber,et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem , 2011, Front. Psychol..
[39] Jan Peters,et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search , 2013, AAAI.
[40] Alexander Fabisch,et al. Active contextual policy search , 2014, J. Mach. Learn. Res..
[41] Peter Englert,et al. Multi-task policy search for robotics , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[42] Philip S. Thomas,et al. High Confidence Policy Improvement , 2015, ICML.
[43] J. H. Metzen,et al. Bayesian Optimization for Contextual Policy Search , 2015 .
[44] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[45] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[46] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[47] Philip S. Thomas,et al. Safe Reinforcement Learning , 2015 .
[48] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[49] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[50] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[51] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[52] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[53] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[54] Honglak Lee,et al. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning , 2017, ICML.
[55] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[56] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[57] Ali Farhadi,et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[58] Pieter Abbeel,et al. Reverse Curriculum Generation for Reinforcement Learning , 2017, CoRL.
[59] Kate Saenko,et al. Hierarchical Actor-Critic , 2017, ArXiv.
[60] Pieter Abbeel,et al. Automatic Goal Generation for Reinforcement Learning Agents , 2017, ICML.
[61] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[62] Sergey Levine,et al. Divide-and-Conquer Reinforcement Learning , 2017, ICLR.
[63] Jitendra Malik,et al. Zero-Shot Visual Imitation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[64] Marcin Andrychowicz,et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research , 2018, ArXiv.
[65] Tom Schaul,et al. Unicorn: Continual Learning with a Universal, Off-policy Agent , 2018, ArXiv.
[66] David Hsu,et al. Factored Contextual Policy Search with Bayesian optimization , 2016, 2019 International Conference on Robotics and Automation (ICRA).
[67] J. Schmidhuber,et al. Learning to Generate Focus Trajectories for Attentive Vision , 1990 .