Asynchronous Stochastic Approximation and Q-Learning
[1] V. Nollau. Kushner, H. J./Clark, D. S., Stochastic Approximation Methods for Constrained and Unconstrained Systems (Applied Mathematical Sciences 26). Berlin-Heidelberg-New York, Springer-Verlag 1978. X, 261 pp., 4 figs., DM 26.40, US $13.20, 1980.
[2] John N. Tsitsiklis, et al. Distributed Asynchronous Deterministic and Stochastic Gradient Optimization Algorithms, 1984, 1984 American Control Conference.
[3] Tamer Basar, et al. Asymptotic agreement and convergence of asynchronous stochastic algorithms, 1986, 1986 25th IEEE Conference on Decision and Control.
[4] H. Kushner, et al. Asymptotic properties of distributed and communication stochastic approximation algorithms, 1987.
[5] H. Kushner, et al. Stochastic approximation algorithms for parallel and distributed processing, 1987.
[6] John N. Tsitsiklis, et al. Parallel and distributed computation, 1989.
[7] John N. Tsitsiklis, et al. An Analysis of Stochastic Shortest Path Problems, 1991, Math. Oper. Res.
[8] Andrew W. Moore, et al. Memory-based Reinforcement Learning: Converging with Less Data and Less Real Time, 1993.
[9] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[10] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[11] Peter Dayan, et al. The convergence of TD(λ) for general λ, 1992, Machine Learning.
[12] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[13] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[14] J. Walrand, et al. Distributed Dynamic Programming, 2022.