PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning
[1] John D. C. Little, et al. The Use of Storage Water in a Hydroelectric System, 1955, Oper. Res.
[2] Saul Amarel, et al. On representations of problems of reasoning about actions, 1968.
[3] H. A. Simon, et al. The theory of learning by doing, 1979, Psychological Review.
[4] C. Watkins. Learning from delayed rewards, 1989.
[5] Glenn A. Iba, et al. A heuristic approach to the discovery of macro-operators, 2004, Machine Learning.
[6] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.
[7] Richard S. Sutton, et al. Roles of Macro-Actions in Accelerating Reinforcement Learning, 1998.
[8] Michael H. Bowling, et al. Reusing Learned Policies Between Similar Problems, 1998.
[9] Bruce L. Digney, et al. Learning hierarchical control structures for multiple tasks and changing environments, 1998.
[10] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[11] Thomas G. Dietterich. State Abstraction in MAXQ Hierarchical Reinforcement Learning, 1999, NIPS.
[12] Daniel S. Bernstein, et al. Reusing Old Policies to Accelerate Learning on New MDPs, 1999.
[13] Doina Precup, et al. Temporal abstraction in reinforcement learning, 2000, ICML 2000.
[14] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[15] Amy McGovern. Autonomous Discovery of Abstractions through Interaction with an Environment, 2002, SARA.