Paper information - On the convergence of regret minimization dynamics in concave games

On the convergence of regret minimization dynamics in concave games

We study a general subclass of concave games, which we call socially concave games. We show that if each player follows any no-external-regret minimization procedure, then the dynamics converge in the following sense: the average action vector converges to a Nash equilibrium, and the utility of each player converges to her utility in that Nash equilibrium. We show that many natural games are socially concave. Specifically, we show that linear Cournot competition and linear resource allocation games are socially concave, so our convergence result applies to them. In addition, we show that simple best-response dynamics might diverge for linear resource allocation games and are known to diverge for linear Cournot competition. For TCP congestion games, we show that the games are socially concave "near" the equilibrium, and using our general methodology we show the convergence of a specific regret minimization dynamics.
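As a rough illustration of the convergence claim for linear Cournot competition, the sketch below lets each firm run projected online gradient ascent, which is one example of a no-external-regret procedure, and compares the time-averaged quantities and utilities with the closed-form Nash equilibrium. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the inverse demand `a - b * Q` (not truncated at zero), the constant unit cost `c`, the starting quantities, and the step size `eta0 / sqrt(t)`.

```python
import math


def cournot_nash(n, a, b, c):
    """Symmetric Nash equilibrium of linear Cournot with price a - b*Q and unit cost c."""
    q_star = (a - c) / (b * (n + 1))
    u_star = b * q_star ** 2  # equals q_star * (a - c - b * n * q_star)
    return q_star, u_star


def simulate_cournot(q0=(0.0, 1.0, 4.0), a=10.0, b=1.0, c=1.0,
                     rounds=20000, eta0=0.2):
    """Each firm runs projected online gradient ascent on its own utility.

    Returns the time-averaged quantities and time-averaged utilities.
    All parameter values are hypothetical choices for this sketch.
    """
    q = list(q0)
    n = len(q)
    q_max = a / b  # quantities are kept in the interval [0, q_max]
    action_sum = [0.0] * n
    utility_sum = [0.0] * n

    for t in range(1, rounds + 1):
        price = a - b * sum(q)
        for i in range(n):
            action_sum[i] += q[i]
            utility_sum[i] += q[i] * (price - c)

        # Gradient of u_i = q_i * (a - b*Q - c) with respect to q_i.
        grads = [price - c - b * q[i] for i in range(n)]
        eta = eta0 / math.sqrt(t)  # decreasing step size
        # Simultaneous update followed by projection onto [0, q_max].
        q = [min(q_max, max(0.0, q[i] + eta * grads[i])) for i in range(n)]

    avg_actions = [s / rounds for s in action_sum]
    avg_utilities = [s / rounds for s in utility_sum]
    return avg_actions, avg_utilities


if __name__ == "__main__":
    avg_q, avg_u = simulate_cournot()
    q_star, u_star = cournot_nash(3, 10.0, 1.0, 1.0)
    print("average quantities:", avg_q, "Nash quantity:", q_star)
    print("average utilities:", avg_u, "Nash utility:", u_star)
```

With three firms and these parameters, the averaged quantities and utilities should land close to the equilibrium values. A single simulation of this kind only illustrates the statement and does not substitute for the paper's proof.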

[1] A. Cournot. Researches into the Mathematical Principles of the Theory of Wealth, 1898, Forerunners of Realizable Values Accounting in Financial Reporting.

[2] R. D. Theocharis. On the Stability of the Cournot Solution on the Oligopoly Problem, 1960.

[3] J. Goodman. Note on Existence and Uniqueness of Equilibrium Points for Concave N-Person Games, 1965.

[4] Sally Floyd, et al. Random Early Detection Gateways for Congestion Avoidance, 1993.

[5] A. Mas-Colell, et al. Microeconomic Theory, 1995.

[6] S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[7] Ilya Segal, et al. Solutions manual for Microeconomic theory: Mas-Colell, Whinston and Green, 1997.

[8] Frank Kelly, et al. Charging and rate control for elastic traffic, 1997, Eur. Trans. Telecommun.

[9] D. Fudenberg, et al. The Theory of Learning in Games, 1998.

[10] Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.

[11] Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.

[12] Richard M. Karp, et al. Optimization problems in congestion control, 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[13] J. Shawe-Taylor. Potential-Based Algorithms in On-Line Prediction and Game Theory, 2001.

[14] Sanjeev Arora, et al. A Randomized Online Algorithm for Bandwidth Utilization, 2002, SODA '02.

[15] Bruce Hajek, et al. Do Greedy Autonomous Systems Make for a Sensible Internet, 2003.

[16] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.

[17] H. Peyton Young, et al. Learning, hypothesis testing, and Nash equilibrium, 2003, Games Econ. Behav.

[18] Stochastic Uncoupled Dynamics and Nash Equilibrium, 2004.

[19] Martin A. Zinkevich, et al. Theoretical guarantees for algorithms in multi-agent settings, 2004.

[20] Sham M. Kakade, et al. Deterministic calibration and Nash equilibrium, 2004, J. Comput. Syst. Sci.

[21] H. Peyton Young, et al. Strategic Learning and Its Limits, 2004.

[22] Adam Tauman Kalai, et al. Online convex optimization in the bandit setting: gradient descent without a gradient, 2004, SODA '05.

[23] John N. Tsitsiklis, et al. Efficiency loss in a network resource allocation game: the case of elastic supply, 2004, IEEE Transactions on Automatic Control.

[24] Dean Phillips Foster, et al. Regret Testing: Learning to Play Nash Equilibrium Without Knowing You Have an Opponent, 2006.

[25] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.

[26] Andreu Mas-Colell, et al. Stochastic Uncoupled Dynamics and Nash Equilibrium, 2004, Games Econ. Behav.

[27] Avrim Blum, et al. Routing without regret: on convergence to Nash equilibria of regret-minimizing algorithms in routing games, 2006, PODC '06.

[28] Yishay Mansour, et al. From External to Internal Regret, 2005, J. Mach. Learn. Res.

[29] Yishay Mansour, et al. The communication complexity of uncoupled Nash equilibrium procedures, 2007, STOC '07.

[30] Fabrizio Germano, et al. Global Nash Convergence of Foster and Young's Regret Testing, 2004, Games Econ. Behav.

[31] Elad Hazan, et al. Logarithmic regret algorithms for online convex optimization, 2006, Machine Learning.

[32] Peter L. Bartlett, et al. Adaptive Online Gradient Descent, 2007, NIPS.

[33] Mohammad Taghi Hajiaghayi, et al. Regret minimization and the price of total anarchy, 2008, STOC.