Multi-Agent Systems by Incremental Gradient Reinforcement Learning

We propose a new reinforcement learning (RL) methodology for designing multi-agent systems. In the realistic setting of situated agents with local perception, the task of automatically building a coordinated system is of crucial importance. We use simple reactive agents that learn their own behavior in a decentralized way. To cope with the difficulties inherent to RL in this framework, we have developed an incremental learning algorithm in which agents face progressively more complex tasks. We illustrate this general framework with a computer experiment in which agents must coordinate to reach a global goal.
