Navigating Through Temporal Difference
[1] James S. Albus,et al. New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC) , 1975 .
[2] L. Nadel,et al. The Hippocampus as a Cognitive Map , 1978 .
[3] R. Passingham. The Hippocampus as a Cognitive Map: J. O'Keefe & L. Nadel, Oxford University Press, Oxford (1978), 570 pp., £25.00 , 1979, Neuroscience.
[4] C. Watkins. Learning from delayed rewards , 1989 .
[5] A. Barto,et al. Learning and Sequential Decision Making , 1989 .
[6] Michael C. Mozer,et al. Discovering the Structure of a Reactive Environment by Exploration , 1990, Neural Computation.
[7] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.