Fast Online Q(λ)
[1] James S. Albus,et al. New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC) , 1975 .
[2] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] Teuvo Kohonen,et al. Self-Organization and Associative Memory , 1988 .
[4] C. Watkins. Learning from delayed rewards , 1989 .
[5] S. Thrun. Efficient Exploration in Reinforcement Learning , 1992 .
[6] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[7] Gerald Tesauro,et al. Practical Issues in Temporal Difference Learning , 1992, Mach. Learn..
[8] Steven Douglas Whitehead,et al. Reinforcement learning for the adaptive control of perception and action , 1992 .
[9] Sebastian Thrun,et al. Efficient Exploration In Reinforcement Learning , 1992 .
[10] Bernd Fritzke. Supervised Learning with Growing Cell Structures , 1993, NIPS.
[11] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[12] Pawel Cichosz,et al. Truncating Temporal Differences: On the Efficient Implementation of TD(lambda) for Reinforcement Learning , 1994, J. Artif. Intell. Res..
[13] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[14] Pawel Cichosz. Truncating Temporal Differences: On the Efficient Implementation of TD(λ) for Reinforcement Learning , 1995 .
[15] Richard S. Sutton,et al. Reinforcement Learning with Replacing Eligibility Traces , 2005, Machine Learning.
[16] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[17] Jürgen Schmidhuber,et al. Speeding up Q(lambda)-Learning , 1998, ECML.
[18] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[19] Gerald Tesauro,et al. Practical issues in temporal difference learning , 1992, Machine Learning.
[20] Andrew W. Moore,et al. Locally Weighted Learning , 1997, Artificial Intelligence Review.
[21] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[22] R. Simmons,et al. The effect of representation and knowledge on goal-directed exploration with reinforcement-learning algorithms , 2004, Machine Learning.
[23] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[24] Reid G. Simmons,et al. The Effect of Representation and Knowledge on Goal-Directed Exploration with Reinforcement-Learning Algorithms , 2005, Machine Learning.
[25] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.