Paper information - Reinforcement learning for stochastic cooperative multi-agent-systems

We present a distributed variant of Q-learning that learns the optimal cost-to-go function in stochastic cooperative multi-agent domains without communication between the agents.

Martin Lauer | Martin A. Riedmiller

[1] C. Watkins. Learning from delayed rewards, 1989.

[2] Michael L. Littman et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[3] John N. Tsitsiklis et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.

[4] Craig Boutilier et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.

[5] Michael P. Wellman et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.

[6] Andrew W. Moore et al. Distributed Value Functions, 1999, ICML.

[7] Martin Lauer et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems, 2000, ICML.

[8] Michail G. Lagoudakis et al. Coordinated Reinforcement Learning, 2002, ICML.