Decision Tree Methods for Finding Reusable MDP Homomorphisms
[1] D. H. Ballard, et al. Learning to Perceive and Act by Trial and Error, 1991, Machine Learning.
[2] Leslie Pack Kaelbling, et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons, 1991, IJCAI.
[3] Andrew McCallum, et al. Reinforcement Learning with Selective Perception and Hidden State, 1996.
[4] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[5] Jesse Hoey, et al. SPUDD: Stochastic Planning using Decision Diagrams, 1999, UAI.
[6] Robert Givan, et al. Bounded-Parameter Markov Decision Processes, 2000, Artif. Intell.
[7] Andrew G. Barto, et al. Automated State Abstraction for Options using the U-Tree Algorithm, 2000, NIPS.
[8] Craig Boutilier, et al. Symbolic Dynamic Programming for First-Order MDPs, 2001, IJCAI.
[9] Bernhard Hengst, et al. Discovering Hierarchy in Reinforcement Learning with HEXQ, 2002, ICML.
[10] Craig Boutilier, et al. Value-Directed Compression of POMDPs, 2002, NIPS.
[11] Balaraman Ravindran, et al. SMDP Homomorphisms: An Algebraic Approach to Abstraction in Semi-Markov Decision Processes, 2003, IJCAI.
[12] Robert Givan, et al. Equivalence Notions and Model Minimization in Markov Decision Processes, 2003, Artif. Intell.
[13] Luc De Raedt, et al. Logical Markov Decision Programs, 2003.
[14] A. Barto, et al. An Algebraic Approach to Abstraction in Reinforcement Learning, 2004.
[15] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[16] Robert Givan, et al. Feature-Discovering Approximate Value Iteration Methods, 2005, SARA.
[17] Justus H. Piater, et al. Interactive Learning of Mappings from Visual Percepts to Actions, 2005, ICML.
[18] Andrew G. Barto, et al. A Causal Approach to Hierarchical Decomposition of Factored MDPs, 2005, ICML.