Regret Minimization for Branching Experts

We study regret minimization bounds in which the dependence on the number of experts is replaced by measures of the realized complexity of the expert class. These measures are defined in retrospect, given the realized losses. We focus on two cases of interest. In the first, the measure of complexity is the number of distinct “leading experts”, namely, experts that were best at some point in time. We derive regret bounds that depend only on this measure, independent of the total number of experts. In the second, all experts remain grouped in just a few clusters in terms of their realized cumulative losses. Here too, our regret bounds depend only on the number of clusters determined in retrospect, which serves as the measure of complexity. Our results are obtained as special cases of a more general analysis for a setting of branching experts, where the set of experts may grow over time according to a tree-like structure determined by an adversary. For this branching-experts setting, we give algorithms and analysis covering both the full-information and bandit scenarios.
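To make the branching-experts setting concrete, the following is a minimal, illustrative sketch (not the paper's exact algorithm) of an exponential-weights learner whose expert set grows over time. When an expert branches, its weight is split evenly among its children, so total weight is conserved and the learner's effective complexity grows only with the number of experts actually introduced. The class name, learning rate, and branching interface are assumptions for illustration.

```python
import math

class BranchingHedge:
    """Sketch of exponential weights over a growing expert set.

    A branching expert's weight is split evenly among its children,
    preserving the total weight mass. This is an illustrative
    simplification, not the algorithm analyzed in the paper.
    """

    def __init__(self, eta=0.5):
        self.eta = eta           # learning rate (assumed value)
        self.weights = [1.0]     # start with a single root expert

    def branch(self, expert, n_children=2):
        # Split the branching expert's weight evenly; the first child
        # reuses the parent's index, the rest are appended.
        w = self.weights[expert] / n_children
        self.weights[expert] = w
        self.weights.extend([w] * (n_children - 1))

    def predict(self):
        # Probability distribution over the current experts.
        total = sum(self.weights)
        return [w / total for w in self.weights]

    def update(self, losses):
        # Standard multiplicative-weights update, losses in [0, 1].
        assert len(losses) == len(self.weights)
        self.weights = [w * math.exp(-self.eta * loss)
                        for w, loss in zip(self.weights, losses)]
```

Usage: call `branch` whenever the adversary splits an expert, then alternate `predict` and `update` each round. Because weight is split rather than reset, a newly created expert inherits its parent's past performance, which is the key to bounds that depend on the realized tree rather than a fixed expert count.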
