Best of many worlds: Robust model selection for online supervised learning

We introduce algorithms for online, full-information prediction that are competitive with contextual tree experts of unknown complexity, in both probabilistic and adversarial settings. We show that by incorporating a probabilistic framework of structural risk minimization into existing adaptive algorithms, we can robustly learn not only the presence of stochastic structure when it exists (leading to constant as opposed to $\mathcal{O}(\sqrt{T})$ regret), but also the correct model order. We thus obtain regret bounds that are competitive with the regret of an optimal algorithm that possesses strong side information about both the complexity of the optimal contextual tree expert and whether the process generating the data is stochastic or adversarial. These are the first constructive guarantees on simultaneous adaptivity to the model and the presence of stochasticity.
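The core idea of combining a complexity-penalized prior with an adaptive weighting scheme can be illustrated with a minimal sketch. This is not the paper's actual algorithm (which builds on second-order adaptive methods); it is a plain exponentially weighted forecaster whose prior charges each expert for its model order, so that a simple expert is preferred a priori and a complex one must earn its weight through lower loss:

```python
import math

def hedge_with_complexity_prior(expert_losses, complexities, eta=1.0):
    """Exponentially weighted forecaster over experts of varying
    complexity. The prior weight of an expert of complexity k is
    proportional to 2^(-k), a structural-risk-style penalty.
    expert_losses[i][t] is expert i's loss at round t, in [0, 1].
    Returns the learner's cumulative (expected) loss."""
    n = len(expert_losses)
    T = len(expert_losses[0])
    # Log prior weights: complexity penalty instead of a uniform prior.
    log_w = [-k * math.log(2.0) for k in complexities]
    total = 0.0
    for t in range(T):
        # Normalize weights via a numerically stable log-sum-exp.
        m = max(log_w)
        unnorm = [math.exp(lw - m) for lw in log_w]
        z = sum(unnorm)
        probs = [u / z for u in unnorm]
        round_losses = [expert_losses[i][t] for i in range(n)]
        total += sum(p * l for p, l in zip(probs, round_losses))
        # Multiplicative-weights update.
        for i in range(n):
            log_w[i] -= eta * round_losses[i]
    return total
```

For example, with a simple expert (complexity 1) that suffers zero loss and a complex expert (complexity 5) that suffers loss 1 each round, the learner starts concentrated on the simple expert and its cumulative loss stays bounded by a constant, independent of the horizon.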
