A Stochastic View of Optimal Regret through Minimax Duality

We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it equals the maximum, over joint distributions of the adversary's action sequence, of the difference between a sum of minimal conditional expected losses and the minimal empirical loss. We show that the optimal regret has a natural geometric interpretation: it can be viewed as the gap in Jensen's inequality for a concave functional, namely the minimum over the player's actions of the expected loss, defined on a set of probability distributions. We use this expression to obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method yields upper bounds without the need to construct a learning algorithm, and the lower bounds come with explicit optimal strategies for the adversary.
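The duality described above can be sketched as a single identity. Assuming the standard notation of this setting (a player action set $\mathcal{F}$, a loss $\ell(f, x)$ convex in $f$, horizon $T$, and joint distributions $P$ over the adversary's sequence $X_1, \dots, X_T$; these symbols are not fixed in the abstract itself), the minimax regret takes the form

```latex
\mathcal{R}_T
\;=\;
\sup_{P}\;
\mathbb{E}_{(X_1,\dots,X_T)\sim P}\!\left[
  \sum_{t=1}^{T} \inf_{f \in \mathcal{F}}
    \mathbb{E}\big[\ell(f, X_t)\,\big|\, X_1, \dots, X_{t-1}\big]
  \;-\;
  \inf_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell(f, X_t)
\right]
```

The first term is a sum of minimal conditional expected losses, attainable by a forecaster who knows the conditional distribution at each round; the second is the loss of the empirical minimizer in hindsight. Their gap is exactly the Jensen gap of the concave functional $P \mapsto \inf_{f} \mathbb{E}_P\,\ell(f, X)$ evaluated on the empirical versus the true distribution, which is the geometric interpretation the abstract refers to.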
