Patrick M. Pilarski | Richard S. Sutton | Harm van Seijen | Ashique Rupam Mahmood
[1] B. Hudgins, et al. Myoelectric signal processing for control of powered limb prostheses, 2006, Journal of electromyography and kinesiology: official journal of the International Society of Electrophysiological Kinesiology.
[2] Patrick M. Pilarski, et al. Adaptive artificial limbs: a real-time approach to prediction and anticipation, 2013, IEEE Robotics & Automation Magazine.
[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[4] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[5] P. Dayan. The Convergence of TD(λ) for General λ, 1992, Machine Learning.
[6] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[7] Richard S. Sutton, et al. True online TD(λ), 2014, ICML.
[8] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[9] Richard S. Sutton, et al. Multi-timescale nexting in a reinforcement learning robot, 2011, Adapt. Behav.
[10] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.