Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining

We introduce a skill discovery method for reinforcement learning in continuous domains that constructs chains of skills leading to an end-of-task reward. We demonstrate experimentally that it creates appropriate skills and achieves performance benefits in a challenging continuous domain.

Andrew G. Barto | George Konidaris
