Open Problem: Shifting Experts on Easy Data

A number of online algorithms have been developed that guarantee small additional loss (regret) compared to the best "shifting expert". In this model, there is a set of experts and the comparator is the best partition of the trial sequence into a small number of segments, where the expert of smallest loss is chosen in each segment. The regret is typically defined for worst-case data / loss sequences. There has been a recent surge of interest in online algorithms that combine good worst-case guarantees with much improved performance on easy data. A practically relevant class of easy data is the case when the loss of each expert is iid and there is a gap between the mean losses of the best and second-best experts. In the full information setting, the FlipFlop algorithm by De Rooij et al. (2014) combines the best of the iid-optimal Follow-The-Leader (FTL) and the worst-case-safe Hedge algorithms, whereas in the bandit information case SAO by Bubeck and Slivkins (2012) competes with both the iid-optimal UCB and the worst-case-safe EXP3.

We ask the same questions for the shifting expert problem. First, what are simple and efficient algorithms for the shifting experts problem when the loss sequence in each segment is iid with respect to a fixed but unknown distribution? Second, how can the performance of such algorithms on easy data be efficiently united with worst-case robustness? A particularly intriguing open problem is the case when the comparator shifts within a small subset of experts from a large set, under the assumption that the losses in each segment are iid.
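For concreteness, the classical worst-case approach to the shifting experts model is the Fixed Share algorithm of Herbster and Warmuth [7]: a Hedge-style exponential update followed by a small "share" step that mixes a fraction of the weight back toward uniform, so the algorithm can recover quickly when the best expert changes. The sketch below is only an illustration of that update rule; the learning rate `eta` and switching rate `alpha` are arbitrary example values, not the tuned choices from the literature.

```python
import math

def fixed_share(losses, eta=0.5, alpha=0.05):
    """Fixed Share sketch: `losses` is a list of per-trial loss vectors
    (one entry per expert, losses in [0, 1]). Returns the algorithm's
    cumulative expected loss under its weight vector."""
    K = len(losses[0])
    w = [1.0 / K] * K          # start with uniform weights
    total = 0.0
    for loss in losses:
        # loss suffered this trial is the weighted average of expert losses
        total += sum(wi * li for wi, li in zip(w, loss))
        # Hedge (exponentially weighted) update
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, loss)]
        s = sum(w)
        # share step: normalize, then mix a fraction alpha of uniform weight
        # back in, so no expert's weight ever drops below alpha / K
        w = [(1 - alpha) * wi / s + alpha / K for wi in w]
    return total
```

On a sequence with two segments where a different expert is best in each, the share step lets the weights re-concentrate on the new leader within a few trials after the switch, whereas plain Hedge (alpha = 0) can take much longer to recover. The open questions above ask how to improve on this worst-case-style algorithm when the losses within each segment are additionally iid.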

[1] Wouter M. Koolen, et al. Putting Bayes to sleep, 2012, NIPS.

[2] Marcus Hutter, et al. Adaptive Online Prediction by Following the Perturbed Leader, 2005, J. Mach. Learn. Res..

[3] Mark Herbster, et al. Tracking the Best Linear Predictor, 2001, J. Mach. Learn. Res..

[4] Vladimir Vovk, et al. A game of prediction with expert advice, 1995, COLT '95.

[5] Wouter M. Koolen, et al. Follow the leader if you can, hedge if you must, 2013, J. Mach. Learn. Res..

[6] Wojciech Kotlowski, et al. Follow the Leader with Dropout Perturbations, 2014, COLT.

[7] Mark Herbster, et al. Tracking the Best Expert, 1995, Machine Learning.

[8] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, J. Comput. Syst. Sci..

[9] Manfred K. Warmuth, et al. Repeated Games against Budgeted Adversaries, 2010, NIPS.

[10] Manfred K. Warmuth, et al. Tracking a Small Set of Experts by Mixing Past Posteriors, 2003, J. Mach. Learn. Res..

[11] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.

[12] Nicolò Cesa-Bianchi, et al. Mirror Descent Meets Fixed Share (and feels no regret), 2012, NIPS.

[13] Wouter M. Koolen, et al. Adaptive Hedge, 2011, NIPS.

[14] Aleksandrs Slivkins, et al. The Best of Both Worlds: Stochastic and Adversarial Bandits, 2012, COLT.

[15] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput..