Blackwell Approachability and No-Regret Learning are Equivalent

We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a "no-regret" algorithm for a simple online learning problem. We show that this relationship is in fact much stronger: Blackwell's result is equivalent, in a very strong sense, to the problem of regret minimization for Online Linear Optimization. We show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide one novel application of this reduction: the first efficient algorithm for calibrated forecasting.
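To make "regret minimization for Online Linear Optimization" concrete, the following is a minimal sketch of one standard no-regret method, projected online gradient descent over the Euclidean unit ball. It is not the reduction described in the abstract; the function names, the choice of the unit ball as decision set, and the step size 1/sqrt(T) are assumptions made for illustration only.

```python
import math


def project_to_unit_ball(x):
    """Euclidean projection of x onto the unit ball (assumed decision set)."""
    norm = math.sqrt(sum(v * v for v in x))
    return list(x) if norm <= 1.0 else [v / norm for v in x]


def online_gradient_descent_regret(loss_vectors):
    """Play x_t against linear losses <f_t, x_t> and return the regret.

    Regret = (learner's total loss) - (loss of the best fixed point in the ball).
    Assumes every loss vector has Euclidean norm at most 1.
    """
    horizon = len(loss_vectors)
    dim = len(loss_vectors[0])
    step = 1.0 / math.sqrt(horizon)  # illustrative step size, assumes known horizon
    x = [0.0] * dim                  # start at the center of the ball
    total_loss = 0.0
    cumulative = [0.0] * dim         # running sum of loss vectors
    for f in loss_vectors:
        total_loss += sum(fi * xi for fi, xi in zip(f, x))
        cumulative = [c + fi for c, fi in zip(cumulative, f)]
        # Gradient of the linear loss <f, x> is f itself; step and project.
        x = project_to_unit_ball([xi - step * fi for xi, fi in zip(x, f)])
    # Over the unit ball, the best fixed point is -cumulative / ||cumulative||,
    # which incurs loss -||cumulative||.
    best_fixed_loss = -math.sqrt(sum(c * c for c in cumulative))
    return total_loss - best_fixed_loss
```

Under these assumptions the standard analysis bounds the regret by sqrt(T) for any sequence of T loss vectors, so the average regret per round vanishes as T grows, which is the "no-regret" property.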

Peter L. Bartlett | Elad Hazan | Jacob D. Abernethy
