暂无分享,去创建一个
[1] Satinder P. Singh,et al. Linear options , 2010, AAMAS.
[2] Lihong Li,et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning , 2014, ICML.
[3] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[4] Wilco Moerman,et al. Hierarchical Reinforcement Learning: Assignment of Behaviours to Subpolicies by Self-Organization , 2009 .
[5] Doina Precup,et al. Optimal policy switching algorithms for reinforcement learning , 2010, AAMAS.
[6] V. Lesser,et al. Accelerating Multi-agent Reinforcement Learning with Dynamic Co-learning , 2015 .
[7] Andrew G. Barto,et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.
[8] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[9] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[10] Shie Mannor,et al. Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations , 2014, ICML.
[11] Shalabh Bhatnagar,et al. Natural actor-critic algorithms , 2009, Autom..
[12] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[13] Milos Hauskrecht,et al. Hierarchical Solution of Markov Decision Processes using Macro-actions , 1998, UAI.
[14] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[15] David K. Smith,et al. Dynamic Programming and Optimal Control. Volume 1 , 1996 .
[16] Pieter Abbeel,et al. Exploration and apprenticeship learning in reinforcement learning , 2005, ICML.
[17] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[18] Andrew G. Barto,et al. Behavioral Hierarchy: Exploration and Representation , 2013, Computational and Robotic Models of the Hierarchical Organization of Behavior.
[19] Justin A. Boyan,et al. Technical Update: Least-Squares Temporal Difference Learning , 2002, Machine Learning.
[20] Doina Precup,et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options , 1998, ECML.
[21] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[22] Nicholas Roy,et al. Efficient Planning under Uncertainty with Macro-actions , 2014, J. Artif. Intell. Res..
[23] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[24] Shie Mannor,et al. Time-regularized interrupting options , 2014, ICML 2014.
[25] Douglas Summers-Stay,et al. Using a minimal action grammar for activity understanding in the real world , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.