Paper information - Robust approachability and regret minimization in games with partial monitoring

Robust approachability and regret minimization in games with partial monitoring

Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the obtained reward, which belongs to a set rather than being a single vector. Using this variant we tackle the problem of approachability in games with partial monitoring and develop simple and efficient algorithms (i.e., with constant per-step complexity) for this setup. We finally consider external regret and internal regret in repeated games with partial monitoring and derive regret-minimizing strategies based on approachability theory.

Shie Mannor | Vianney Perchet | Gilles Stoltz

[1] Vianney Perchet, et al. Approachability of Convex Sets in Games with Partial Monitoring, 2011, J. Optim. Theory Appl.

[2] Shie Mannor, et al. On-Line Learning with Imperfect Monitoring, 2003, COLT.

[3] A. Rustichini. Minimizing Regret: The General Case, 1999.

[4] Jörg Rambau, et al. Projections of polytopes and the generalized Baues conjecture, 1996, Discret. Comput. Geom.

[5] Xiaohong Chen, et al. Laws of Large Numbers for Hilbert Space-Valued Mixingales with Applications, 1996, Econometric Theory.

[6] Microeconomics-Charles W. Upton. Repeated games, 2020, Game Theory.

[7] Vianney Perchet, et al. Internal Regret with Partial Monitoring: Calibration-Based Optimal Algorithms, 2011, J. Mach. Learn. Res.

[8] Vianney Perchet, et al. On an unified framework for approachability in games with or without signals, 2013, ArXiv.

[9] Myint Swe Khine. Learning to Play, 2011.

[10] Andreu Mas-Colell, et al. A General Class of Adaptive Strategies, 1999, J. Econ. Theory.

[11] Gábor Lugosi, et al. Prediction, learning, and games, 2006.

[12] Ambuj Tewari, et al. Online Learning: Beyond Regret, 2010, COLT.

[13] Shie Mannor, et al. Regret minimization in repeated matrix games with variable stage duration, 2008, Games Econ. Behav.

[14] Vianney Perchet, et al. Calibration and Internal No-Regret with Random Signals, 2009, ALT.

[15] Christian Schindelhauer, et al. Discrete Prediction Games with Arbitrary Feedback and Loss, 2001, COLT/EuroCOLT.

[16] D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.

[17] Nicolò Cesa-Bianchi, et al. Regret Minimization Under Partial Monitoring, 2006, 2006 IEEE Information Theory Workshop - ITW '06 Punta del Este.

[18] John N. Tsitsiklis, et al. Online Learning with Sample Path Constraints, 2009, J. Mach. Learn. Res.

[19] Shie Mannor, et al. A Geometric Proof of Calibration, 2009, Math. Oper. Res.

[20] Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.

[21] D. Blackwell. Controlled Random Walks, 2010.

[22] Shie Mannor, et al. Strategies for Prediction Under Imperfect Monitoring, 2007, Math. Oper. Res.

[23] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.

[24] A. Dawid. The Well-Calibrated Bayesian, 1982.

[25] E. Lehrer, et al. Learning to play partially-specified equilibrium, 2007.

[26] S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[27] Peter L. Bartlett, et al. Blackwell Approachability and No-Regret Learning are Equivalent, 2010, COLT.

[28] John E. Laird, et al. Learning to play, 2009.

[29] D. Freedman. On Tail Probabilities for Martingales, 1975.

[30] Yishay Mansour, et al. From External to Internal Regret, 2005, J. Mach. Learn. Res.

[31] Joseph O'Rourke, et al. Handbook of Discrete and Computational Geometry, Second Edition, 1997.