Approximate Policy Iteration with a Policy Language Bias
Robert Givan | Alan Fern | Sung Wook Yoon
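The title refers to approximate policy iteration (API) in which the improved policy is learned, under a policy-language bias, from rollout-generated training data rather than represented by a value function. As a rough orientation only, the sketch below shows the generic classification-style API loop on a toy MDP: rollouts estimate the current policy's Q-values, sampled states are labeled with the rollout-greedy action, and a new policy is fit to those labels. All names here (ChainMDP, rollout_q, improved_labels) are illustrative assumptions, and a plain lookup table stands in for the paper's learned decision-list policy over a concept language.

```python
import random

class ChainMDP:
    """Toy 10-state chain MDP: action 1 moves right, action 0 moves left;
    reward 1 for reaching the rightmost state. Purely illustrative."""
    n_states, actions, horizon = 10, (0, 1), 20

    def step(self, s, a):
        s2 = min(s + 1, self.n_states - 1) if a == 1 else max(s - 1, 0)
        return s2, 1.0 if s2 == self.n_states - 1 else 0.0

def rollout_q(mdp, s, a, policy, width=5, gamma=0.95):
    """Monte-Carlo estimate of Q^pi(s, a): take `a` once, then follow `policy`."""
    total = 0.0
    for _ in range(width):
        state, r = mdp.step(s, a)
        ret, disc = r, gamma
        for _ in range(mdp.horizon):
            state, r = mdp.step(state, policy(state))
            ret += disc * r
            disc *= gamma
        total += ret
    return total / width

def improved_labels(mdp, policy, n_samples=50):
    """Label sampled states with the rollout-greedy action: the training set
    that a classifier-style policy learner would be fit to."""
    data = {}
    for _ in range(n_samples):
        s = random.randrange(mdp.n_states)
        data[s] = max(mdp.actions, key=lambda a: rollout_q(mdp, s, a, policy))
    return data

def approximate_policy_iteration(iterations=3):
    mdp = ChainMDP()
    policy = lambda s: random.choice(mdp.actions)   # start from a random policy
    for _ in range(iterations):
        table = improved_labels(mdp, policy)
        # The paper fits a decision-list policy in a taxonomic concept language
        # at this step; a lookup table is used as a stand-in learner here.
        policy = lambda s, t=table: t.get(s, 1)
    return policy

if __name__ == "__main__":
    pi = approximate_policy_iteration()
    print([pi(s) for s in range(10)])   # mostly 1s: "move right" toward the reward
```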
[2] R. Bellman. Dynamic programming. , 1957, Science.
[4] Oren Etzioni,et al. Explanation-Based Learning: A Problem Solving Perspective , 1989, Artif. Intell..
[5] Balas K. Natarajan,et al. On learning from exercises , 1989, COLT '89.
[6] Steven Minton,et al. Quantitative Results Concerning the Utility of Explanation-based Learning , 1988, Artif. Intell..
[7] Gerald Tesauro,et al. Practical Issues in Temporal Difference Learning , 1992, Mach. Learn..
[8] Robert Givan,et al. Taxonomic syntax for first order inference , 1989, JACM.
[9] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[10] Steven Minton,et al. Machine Learning Methods for Planning , 1994 .
[11] Eugene Fink,et al. Integrating planning and learning: the PRODIGY architecture , 1995, J. Exp. Theor. Artif. Intell..
[12] Gerald Tesauro,et al. On-line Policy Improvement using Monte-Carlo Search , 1996, NIPS.
[13] Anders R. Kristensen,et al. Dynamic programming and Markov decision processes , 1996 .
[14] Tara A. Estlin,et al. Multi-Strategy Learning of Search Control for Partial-Order Planning , 1996, AAAI/IAAI, Vol. 1.
[15] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[16] Craig Boutilier,et al. Approximate Value Trees in Structured Dynamic Programming , 1996, ICML.
[17] Robert Givan,et al. Model Minimization in Markov Decision Processes , 1997, AAAI/IAAI.
[18] Robert Givan,et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes , 1997, UAI.
[19] Prasad Tadepalli,et al. Learning Goal-Decomposition Rules Using Exercises , 1997, AAAI/IAAI.
[20] Shaul Markovitch,et al. A Selective Macro-learning Algorithm and its Application to the NxN Sliding-Tile Puzzle , 1998, J. Artif. Intell. Res..
[21] Luc De Raedt,et al. Relational Reinforcement Learning , 1998, ILP.
[22] Roni Khardon,et al. Learning Action Strategies for Planning Domains , 1999, Artif. Intell..
[23] Bart Selman,et al. Learning Declarative Control Rules for Constraint-BAsed Planning , 2000, ICML.
[24] Craig Boutilier,et al. Stochastic dynamic programming with factored representations , 2000, Artif. Intell..
[25] Hector Geffner,et al. Learning Generalized Policies in Planning Using Concept Languages , 2000, KR.
[26] Fahiem Bacchus,et al. Using temporal logics to express search control knowledge for planning , 2000, Artif. Intell..
[27] Craig A. Knoblock,et al. Learning Plan Rewriting Rules , 2000, AIPS.
[28] Gang Wu,et al. Congestion control via online sampling , 2001, Proceedings IEEE INFOCOM 2001. Conference on Computer Communications. Twentieth Annual Joint Conference of the IEEE Computer and Communications Society (Cat. No.01CH37213).
[29] Carlos Guestrin,et al. Max-norm Projections for Factored MDPs , 2001, IJCAI.
[30] Bernhard Nebel,et al. The FF Planning System: Fast Plan Generation Through Heuristic Search , 2001, J. Artif. Intell. Res..
[31] Fahiem Bacchus,et al. The AIPS '00 Planning Competition , 2001, The AI Magazine.
[32] Craig Boutilier,et al. Symbolic Dynamic Programming for First-Order MDPs , 2001, IJCAI.
[33] Xin Wang,et al. Batch Value Function Approximation via Support Vectors , 2001, NIPS.
[34] Fahiem Bacchus,et al. AIPS 2000 Planning Competition: The Fifth International Conference on Artificial Intelligence Planning and Scheduling Systems , 2001 .
[35] Pedro Isasi Viñuela,et al. Using genetic programming to learn and improve control knowledge , 2002, Artif. Intell..
[36] Robert Givan,et al. Inductive Policy Selection for First-Order MDPs , 2002, UAI.
[37] Carlos Guestrin,et al. Generalizing plans to new environments in relational MDPs , 2003, IJCAI 2003.
[38] Terry L. Zimmerman,et al. Learning-Assisted Automated Planning: Looking Back, Taking Stock, Going Forward , 2003, AI Mag..
[39] Robert Givan,et al. Equivalence notions and model minimization in Markov decision processes , 2003, Artif. Intell..
[40] Shobha Venkataraman,et al. Efficient Solution Algorithms for Factored MDPs , 2003, J. Artif. Intell. Res..
[41] Håkan L. S. Younes. Extending PDDL to Model Stochastic Decision Processes , 2003 .
[42] Jeff G. Schneider,et al. Policy Search by Dynamic Programming , 2003, NIPS.
[43] Michail G. Lagoudakis,et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers , 2003, ICML.
[44] Andrew G. Barto,et al. Building a Basic Block Instruction Scheduler with Reinforcement Learning and Rollouts , 2002, Machine Learning.
[45] Benjamin Van Roy,et al. Solitaire: Man Versus Machine , 2004, NIPS.
[46] Luc De Raedt,et al. Bellman goes relational , 2004, ICML.
[48] Roni Khardon,et al. Learning to Take Actions , 1996, Machine Learning.
[49] Sylvie Thiébaux,et al. Exploiting First-Order Regression in Inductive Policy Selection , 2004, UAI.
[50] Jörg Hoffmann,et al. Ordered Landmarks in Planning , 2004, J. Artif. Intell. Res..
[51] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[52] John N. Tsitsiklis,et al. Feature-based methods for large scale dynamic programming , 2004, Machine Learning.
[53] R. Rivest. Learning Decision Lists , 1987, Machine Learning.