Temporal abstraction in reinforcement learning
[1] G. Carrier,et al. Singular Perturbation Methods , 1976 .
[2] Earl David Sacerdoti,et al. A Structure for Plans and Behavior , 1977 .
[3] Benjamin Kuipers,et al. Common-Sense Knowledge of Space: Learning from Experience , 1979, IJCAI.
[4] J. Hammersley. Simulation and the Monte Carlo Method , 1982 .
[5] R. Korf. Learning to solve problems by searching for macro-operators , 1983 .
[6] D. Naidu,et al. Singular Perturbation Analysis of Discrete Control Systems , 1985 .
[7] Rodney A. Brooks,et al. A Robust Layered Control System For A Mobile Robot , 1986 .
[8] Allen Newell,et al. Chunking in Soar , 1986 .
[9] Richard E. Korf,et al. Planning as Search: A Quantitative Approach , 1987, Artif. Intell..
[10] Steven Minton,et al. Learning search control knowledge , 1988 .
[11] D. Naidu. Singular Perturbation Methodology in Control Systems , 1988 .
[12] Glenn A. Iba,et al. A heuristic approach to the discovery of macro-operators , 1989, Machine Learning.
[13] Stephen F. McCormick,et al. Multilevel adaptive methods for partial differential equations , 1989, Frontiers in applied mathematics.
[14] Oren Etzioni,et al. Why PRODIGY/EBL Works , 1990, AAAI.
[15] A. Newell,et al. The Problem of Expensive Chunks and its Solution by Restricting Expressiveness , 1990 .
[16] Lambert E. Wixson,et al. Scaling Reinforcement Learning Techniques via Modularity , 1991, ML.
[17] Roger W. Brockett,et al. Hybrid Models for Motion Control Systems , 1993 .
[18] Leslie Pack Kaelbling,et al. Learning to Achieve Goals , 1993, IJCAI.
[19] Peter Dayan,et al. Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.
[20] Marco Colombetti,et al. Robot Shaping: Developing Autonomous Agents Through Learning , 1994, Artif. Intell..
[21] Benjamin Kuipers,et al. Learning to Explore and Build Maps , 1994, AAAI.
[22] Gerald DeJong,et al. Learning to Plan in Continuous Domains , 1994, Artif. Intell..
[23] Mark B. Ring. Continual learning in reinforcement environments , 1995, GMD-Bericht.
[24] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[25] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[26] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[27] Selahattin Kuru,et al. Qualitative System Identification: Deriving Structure from Behavior , 1996, Artif. Intell..
[28] Paul R. Cohen,et al. Searching for Planning Operators with Context-Dependent and Probabilistic Effects , 1996, AAAI/IAAI, Vol. 1.
[29] Gerald DeJong,et al. A Statistical Approach to Adaptive Problem Solving , 1996, Artif. Intell..
[30] Jonas Karlsson,et al. Learning to Solve Multiple Goals , 1997 .
[31] Ronen I. Brafman,et al. Modeling Agents as Qualitative Decision Makers , 1997, Artif. Intell..
[32] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[33] Roderic A. Grupen,et al. A feedback control structure for on-line learning tasks , 1997, Robotics Auton. Syst..
[34] E. B. Baum,et al. Manifesto for an evolutionary economics of intelligence , 1998 .
[35] Kee-Eung Kim,et al. Solving Very Large Weakly Coupled Markov Decision Processes , 1998, AAAI/IAAI.
[36] Mark D. Pendrith,et al. RL-TOPS: An Architecture for Modularity and Re-Use in Reinforcement Learning , 1998, ICML.
[37] Manuela M. Veloso,et al. Team-Partitioned, Opaque-Transition Reinforcement Learning , 1998, RoboCup.