Improving Generalization for Temporal Difference Learning: The Successor Representation
[1] A. L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[2] James S. Albus, et al. A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC), 1975.
[3] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[4] Charles W. Anderson, et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning), 1986.
[5] Stephen M. Omohundro, et al. Efficient Algorithms with Neural Network Behavior, 1987, Complex Syst.
[6] A. Barto, et al. Learning and Sequential Decision Making, 1989.
[7] Peter Dayan, et al. Navigating Through Temporal Difference, 1990, NIPS.
[8] Andrew W. Moore, et al. Efficient memory-based learning for robot control, 1990.
[9] Paul J. Werbos, et al. Consistency of HDP applied to a simple reinforcement learning problem, 1990, Neural Networks.
[10] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[11] Leslie Pack Kaelbling, et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons, 1991, IJCAI.
[12] Sebastian Thrun, et al. Active Exploration in Dynamic Environments, 1991, NIPS.
[13] P. Dayan. Reinforcing connectionism: learning the statistical way, 1991.
[14] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.