On-policy concurrent reinforcement learning
[1] Jürgen Schmidhuber,et al. Fast Online Q(λ) , 1998, Machine Learning.
[2] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[3] Bikramjit Banerjee,et al. Fast Concurrent Reinforcement Learners , 2001, IJCAI.
[4] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[5] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[6] J. Nash. NON-COOPERATIVE GAMES , 1951, Classics in Game Theory.
[7] Eric van Damme,et al. Non-Cooperative Games , 2000 .
[8] Gavin Adrian Rummery. Problem solving with reinforcement learning , 1995 .
[9] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell.
[10] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[11] Geoffrey J. Gordon. Reinforcement Learning with Function Approximation Converges to a Region , 2000, NIPS.
[12] O. Mangasarian,et al. Two-person nonzero-sum games and quadratic programming , 1964 .
[13] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[14] Csaba Szepesvári,et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.
[15] Tuomas Sandholm,et al. On Multiagent Q-Learning in a Semi-Competitive Domain , 1995, Adaption and Learning in Multi-Agent Systems.
[16] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[17] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[18] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[19] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[20] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[21] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[22] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[23] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst.