Dynamic abstraction in reinforcement learning via clustering
暂无分享,去创建一个
Shie Mannor | Ishai Menache | Uri Klein | Amit Hoze | Shie Mannor | Ishai Menache | Amit Hoze | Uri Klein
[1] Michael R. Anderberg,et al. Cluster Analysis for Applications , 1973 .
[2] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] Anil K. Jain,et al. Algorithms for Clustering Data , 1988 .
[4] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[5] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[6] Dorit S. Hochbaum,et al. Approximation Algorithms for NP-Hard Problems , 1996 .
[7] Richard S. Sutton,et al. Roles of Macro-Actions in Accelerating Reinforcement Learning , 1998 .
[8] Bruce L. Digney,et al. Learning hierarchical control structures for multiple tasks and changing environments , 1998 .
[9] Jean-Arcady Meyer,et al. Learning Hierarchical Control Structures for Multiple Tasks and Changing Environments , 1998 .
[10] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[11] John J. Grefenstette,et al. Evolutionary Algorithms for Reinforcement Learning , 1999, J. Artif. Intell. Res..
[12] Anil K. Jain,et al. Data clustering: a review , 1999, CSUR.
[13] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[14] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[15] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[16] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[17] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[18] Leslie Pack Kaelbling,et al. Approximate Planning in POMDPs with Macro-Actions , 2003, NIPS.
[19] Pierre Geurts,et al. Iteratively Extending Time Horizon Reinforcement Learning , 2003, ECML.
[20] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[21] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.