Strategic best-response learning in multiagent systems
[1] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[2] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[3] Gerald Tesauro, et al. Extending Q-Learning to General Adaptive Multi-Agent Systems, 2003, NIPS.
[4] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[5] J. Albus. A Theory of Cerebellar Function, 1971.
[6] Ronitt Rubinfeld, et al. Efficient algorithms for learning to play repeated games against computationally bounded adversaries, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[7] Lance Fortnow, et al. Optimality and domination in repeated games with bounded players, 1993, STOC '94.
[8] Claude-Nicolas Fiechter. Expected Mistake Bound Model for On-Line Reinforcement Learning, 1997, ICML.
[9] Yoav Shoham, et al. Polynomial-time reinforcement learning of near-optimal policies, 2002, AAAI/IAAI.
[10] Bikramjit Banerjee, et al. Efficient No-Regret Multiagent Learning, 2005, AAAI.
[11] Manuela M. Veloso, et al. Multiagent learning using a variable learning rate, 2002, Artif. Intell.
[12] W. Hamilton, et al. The evolution of cooperation, 1984, Science.
[13] Keith B. Hall, et al. Correlated Q-Learning, 2003, ICML.
[14] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[15] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[16] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[17] David Carmel, et al. How to explore your opponent's strategy (almost) optimally, 1998, Proceedings International Conference on Multi Agent Systems (Cat. No.98EX160).
[18] David Carmel, et al. Model-based learning of interaction strategies in multi-agent systems, 1998, J. Exp. Theor. Artif. Intell.
[19] Yishay Mansour, et al. Nash Convergence of Gradient Dynamics in General-Sum Games, 2000, UAI.
[20] Yoav Shoham, et al. New Criteria and a New Algorithm for Learning in Multi-Agent Systems, 2004, NIPS.
[21] Vincent Conitzer, et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents, 2003, Machine Learning.
[22] Jeff G. Schneider, et al. Policy Search by Dynamic Programming, 2003, NIPS.
[23] Yoav Shoham, et al. Learning against opponents with bounded memory, 2005, IJCAI.
[24] David Carmel, et al. Learning Models of Intelligent Agents, 1996, AAAI/IAAI, Vol. 1.
[25] Bikramjit Banerjee, et al. Performance Bounded Reinforcement Learning in Strategic Interactions, 2004, AAAI.
[26] Bikramjit Banerjee, et al. Efficient learning of multi-step best response, 2005, AAMAS '05.
[27] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 1998, Machine Learning.
[28] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.
[29] Claude-Nicolas Fiechter, et al. Efficient reinforcement learning, 1994, COLT '94.