Related Papers

The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems

Abstract: Reinforcement learning can provide a robust and natural means for agents to learn how to coordinate their action choices in multiagent systems. We examine some of the factors that can influence the dynamics of the learning process in such a setting. We first distinguish reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts. We study (a simple form of) Q-learning in cooperative multiagent systems under these two perspectives, focusing on the influence of game structure and exploration strategies on convergence to (optimal and suboptimal) Nash equilibria. We then propose alternative optimistic exploration strategies that increase the likelihood of convergence to an optimal equilibrium.
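The first perspective in the abstract, learners that ignore the presence of other agents, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 2x2 payoff matrix, the learning rate, the Boltzmann temperature schedule, and the function names. Each agent keeps Q-values over its own actions only and updates them from the shared reward.

```python
import math
import random

# Hypothetical 2x2 cooperative game (an assumption, not from the paper):
# both agents receive the same reward PAYOFF[a0][a1].
PAYOFF = [[10.0, 0.0],
          [0.0, 5.0]]


def boltzmann_choice(q_values, temperature, rng):
    """Sample an action with probability proportional to exp(Q / T)."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights)[0]


def run_independent_learners(episodes=2000, alpha=0.1, seed=0):
    """Two stateless Q-learners that each see only their own action and the reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[agent][own action]
    temperature = 16.0            # assumed starting temperature
    for _ in range(episodes):
        a0 = boltzmann_choice(q[0], temperature, rng)
        a1 = boltzmann_choice(q[1], temperature, rng)
        reward = PAYOFF[a0][a1]
        # Stateless Q-update: move the estimate toward the observed reward.
        q[0][a0] += alpha * (reward - q[0][a0])
        q[1][a1] += alpha * (reward - q[1][a1])
        # Assumed decay schedule: explore early, exploit later.
        temperature = max(0.05, temperature * 0.995)
    greedy = [max(range(2), key=lambda a: qi[a]) for qi in q]
    return q, greedy


if __name__ == "__main__":
    q_values, greedy_actions = run_independent_learners()
    print("Q-values:", q_values)
    print("Greedy joint action:", greedy_actions)
```

Depending on the seed and the exploration schedule, the greedy joint action may settle on either coordinated outcome, which is the kind of convergence to optimal or suboptimal equilibria the abstract refers to. A joint action learner would instead index its Q-values by the pair of actions and keep an estimate of the other agent's strategy.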

References

[1] O. H. Brownlee, et al. Activity Analysis of Production and Allocation, 1952.

[2] David Lewis. Convention: A Philosophical Study, 1986.

[3] W. Hamilton, et al. The Evolution of Cooperation, 1984.

[4] John C. Harsanyi, et al. A General Theory of Equilibrium Selection in Games, 1989.

[5] Roger B. Myerson, et al. Game Theory: Analysis of Conflict, 1991.

[6] David M. Kreps, et al. Lectures on Learning and Equilibrium in Strategic Form Games, 1992.

[7] Moshe Tennenholtz, et al. On the Synthesis of Useful Social Laws for Artificial Agent Societies (Preliminary Report), 1992, AAAI.

[8] Moshe Tennenholtz, et al. Emergent Conventions in Multi-Agent Systems: Initial Experimental Results and Observations (Preliminary Report), 1992, KR.

[9] H. Young, et al. The Evolution of Conventions, 1993.

[10] John N. Tsitsiklis, et al. Asynchronous Stochastic Approximation and Q-learning, 1993, Proceedings of 32nd IEEE Conference on Decision and Control.

[11] Ming Tan, et al. Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents, 1997, ICML.

[12] R. Rob, et al. Learning, Mutation, and Long Run Equilibria in Games, 1993.

[13] Holly A. Yanco, et al. An Adaptive Communication Protocol for Cooperating Mobile Robots, 1993.

[14] D. Fudenberg, et al. Steady State Learning and Nash Equilibrium, 1993.

[15] Gerhard Weiss, et al. Learning to Coordinate Actions in Multi-Agent-Systems, 1993, IJCAI.

[16] E. Kalai, et al. Rational Learning Leads to Nash Equilibrium, 1993.

[17] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[18] Sandip Sen, et al. Learning to Coordinate without Sharing Information, 1994, AAAI.

[19] Sandip Sen, et al. Multiagent Coordination with Learning Classifier Systems, 1995, Adaption and Learning in Multi-Agent Systems.

[20] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[21] L. Shapley, et al. Fictitious Play Property for Games with Identical Interests, 1996.

[22] Robert H. Crites, et al. Multiagent Reinforcement Learning in the Iterated Prisoner's Dilemma, 1996, Bio Systems.

[23] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.

[24] Junling Hu, et al. Self-fulfilling Bias in Multiagent Learning, 1996.

[25] Craig Boutilier, et al. Planning, Learning and Coordination in Multiagent Decision Processes, 1996, TARK.

[26] Craig Boutilier, et al. Learning Conventions in Multiagent Stochastic Domains using Likelihood Estimates, 1996, UAI.

[27] V. Borkar. Asynchronous Stochastic Approximations, 1998.

Cited By
Uncovering demand flexibility in buildings: a smart grid inter-operation framework for the optimization of energy and comfort. 2017.
Multiagent Learning Paradigms. EUMAS/AT, 2017.
Learning to Play: Reinforcement Learning and Games. 2020.
Multi-Agent Reinforcement Learning: A Survey. 2006.
CLEAN Learning to Improve Coordination and Scalability in Multiagent Systems. 2013.
Reinforcement Learning for Demand Response of Domestic Household Appliances. 2018.
Learning by Experience and by Imitation in Multi-Robot Systems. 2008.
Achieving Coverage through Distributed Reinforcement Learning in Wireless Sensor Networks. 2007 3rd International Conference on Intelligent Sensors, Sensor Networks and Information, 2007.
Cooperative reinforcement learning algorithm to distributed power system based on Multi-Agent. 2009 3rd International Conference on Power Electronics Systems and Applications (PESA), 2009.
Distributed Learning Based Joint Communication and Computation Strategy of IoT Devices in Smart Cities. Sensors, 2020.
Intelligent Link Adaptation for Grant-Free Access Cellular Networks: A Distributed Deep Reinforcement Learning Approach. ArXiv, 2021.
Spectrum Access In Cognitive Radio Using a Two-Stage Reinforcement Learning Approach. IEEE Journal of Selected Topics in Signal Processing, 2017.
Entropy Controlled Non-Stationarity for Improving Performance of Independent Learners in Anonymous MARL Settings. ArXiv, 2018.
A Round-robin Scheduling Algorithm of Relay-nodes in WSN Based on Self-adaptive Weighted Learning for Environment Monitoring. J. Comput., 2014.
Scalable Optimization for Wind Farm Control using Coordination Graphs. AAMAS, 2021.
Understanding Structure of Concurrent Actions. SGAI Conf., 2019.
Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning. 2020.
Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning. AAMAS, 2018.
Adaptive State Representations for Multi-agent Reinforcement Learning. ICAART, 2011.
HPLAN: Facilitating the Implementation of Joint Human-Agent Activities. PAAMS, 2014.