Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time
[1] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[2] G. Siouris,et al. Optimum systems control , 1979, Proceedings of the IEEE.
[3] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[4] David L. Waltz,et al. Toward memory-based reasoning , 1986, CACM.
[5] P. W. Jones,et al. Bandit Problems, Sequential Allocation of Experiments , 1987 .
[6] Mitsuo Sato,et al. Learning control of finite Markov chains with an explicit trade-off between estimation and control , 1988, IEEE Trans. Syst. Man Cybern..
[7] A. Barto,et al. Learning and Sequential Decision Making , 1989 .
[8] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[9] Richard E. Korf,et al. Real-Time Heuristic Search , 1990, Artif. Intell..
[10] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .
[11] Richard S. Sutton,et al. Time-Derivative Models of Pavlovian Reinforcement , 1990 .
[12] Alan D. Christiansen,et al. Learning reliable manipulation strategies without initial physical models , 1990, Proceedings., IEEE International Conference on Robotics and Automation.
[13] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[14] Andrew G. Barto,et al. On the Computational Economics of Reinforcement Learning , 1991 .
[15] Lawrence Birnbaum,et al. Machine learning : proceedings of the Eighth International Workshop (ML91) , 1991 .
[16] Long Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[17] Sebastian Thrun,et al. Active Exploration in Dynamic Environments , 1991, NIPS.
[18] Satinder P. Singh,et al. Transfer of Learning Across Compositions of Sequential Tasks , 1991, ML.
[19] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..
[20] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .