TD Models: Modeling the World at a Mixture of Time Scales

Richard S. Sutton

[1] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.

[2] Thomas Dean, et al. Decomposition Techniques for Planning in Stochastic Domains, 1995, IJCAI.

[3] Leslie Pack Kaelbling, et al. Planning under Time Constraints in Stochastic Domains, 1993, Artif. Intell.

[4] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.

[5] Eric A. Hansen, et al. Cost-Effective Sensing during Plan Execution, 1994, AAAI.

[6] L. Chrisman. Reasoning About Probabilistic Actions At Multiple Levels of Granularity, 1994.

[7] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.

[8] Jonas Karlsson, et al. Learning via task decomposition, 1993.

[9] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.

[10] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.

[11] Jing Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, Adapt. Behav.

[12] C. Atkeson, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.

[13] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.

[14] Satinder P. Singh, et al. Reinforcement Learning with a Hierarchy of Abstract Models, 1992, AAAI.

[15] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.

[16] Lambert E. Wixson, et al. Scaling Reinforcement Learning Techniques via Modularity, 1991, ML.

[17] Mark B. Ring. Incremental Development of Complex Behaviors, 1991, ML.

[18] Gary L. Drescher, et al. Made-up minds - a constructivist approach to artificial intelligence, 1991.

[19] Jürgen Schmidhuber. Neural Sequence Chunkers, 1991.

[20] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.

[21] Maja J. Matarić. A model for distributed mobile robot environment learning and navigation, 1990.

[22] R. Korf. Learning to solve problems by searching for macro-operators, 1983.

[23] Benjamin Kuipers, et al. Common-Sense Knowledge of Space: Learning from Experience, 1979, IJCAI.

[24] Earl David Sacerdoti, et al. A Structure for Plans and Behavior, 1977.

[25] Richard Fikes, et al. Learning and Executing Generalized Robot Plans, 1972, Artif. Intell.