Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models

[1]  Satinder P. Singh,et al.  Reinforcement Learning with a Hierarchy of Abstract Models , 1992, AAAI.

[2]  Richard Yee,et al.  Abstraction in Control Learning , 1992 .

[3]  Jürgen Schmidhuber,et al.  Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.

[4]  Satinder Singh The Efficient Learning of Multiple Task Sequences , 1992 .

[5]  Vijaykumar Gullapalli,et al.  Reinforcement learning and its application to control , 1992 .

[6]  Satinder P. Singh,et al.  The Efficient Learning of Multiple Task Sequences , 1991, NIPS.

[7]  Sebastian Thrun,et al.  Active Exploration in Dynamic Environments , 1991, NIPS.

[8]  Leslie Pack Kaelbling,et al.  Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.

[9]  Paul E. Utgoff,et al.  Two Kinds of Training Information For Evaluation Function Learning , 1991, AAAI.

[10]  Steven D. Whitehead,et al.  Complexity and Cooperation in Q-Learning , 1991, ML.

[11]  Long Ji Lin,et al.  Self-improvement Based on Reinforcement Learning, Planning and Teaching , 1991, ML.

[12]  A. Moore Variable Resolution Dynamic Programming , 1991, ML.

[13]  Andrew G. Barto,et al.  On the Computational Economics of Reinforcement Learning , 1991 .

[14]  L. Kaelbling Learning in embedded systems , 1993 .

[15]  Dana H. Ballard,et al.  Active Perception and Reinforcement Learning , 1990, Neural Computation.

[16]  Richard S. Sutton,et al.  Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[17]  Herbert A. Simon,et al.  Prediction and Prescription in Systems Modeling , 1990, Oper. Res..

[18]  Paul J. Werbos,et al.  Consistency of HDP applied to a simple reinforcement learning problem , 1990, Neural Networks.

[19]  Richard S. Sutton,et al.  Sequential Decision Problems and Neural Networks , 1989, NIPS 1989.

[20]  Dimitri P. Bertsekas,et al.  Dynamic Programming: Deterministic and Stochastic Models , 1987 .

[21]  Richard S. Sutton,et al.  Temporal credit assignment in reinforcement learning , 1984 .

[22]  R. Korf Learning to solve problems by searching for macro-operators , 1983 .

[23]  D. C. Cooper,et al.  Sequential Machines and Automata Theory , 1968, Comput. J..

[24]  Arthur L. Samuel,et al.  Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..