Oracle inequalities for computationally adaptive model selection

We analyze general model selection procedures based on penalized empirical loss minimization under computational constraints. Classical approaches to model selection do not consider its computational cost. We argue that any practical model selection procedure must trade off not only estimation and approximation error, but also the computational effort required to compute empirical minimizers for different function classes. We provide a framework for analyzing such problems, and we give algorithms for model selection under a computational budget. These algorithms satisfy oracle inequalities showing that the risk of the selected model is not much worse than if we had devoted our entire computational budget to the optimal function class.
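
As a rough illustration of the kind of statement involved, the block below sketches the generic shape of a penalized selection rule and of an oracle inequality under a budget. All notation is assumed for illustration and is not taken from the abstract: $\mathcal{F}_i$ are the candidate function classes, $\hat{f}_i$ is an empirical minimizer over $\mathcal{F}_i$, $\hat{R}_n$ and $R$ are the empirical and true risks, $\gamma_i(n)$ is a penalty, $T$ is the computational budget, $n_i(T)$ is the number of samples class $i$ could process if it received the whole budget, and $C$ and $\varepsilon_T$ are an unspecified constant and remainder term.

```latex
% Penalized empirical loss minimization over classes F_1, F_2, ... (assumed notation)
\hat{\imath} \in \operatorname*{arg\,min}_{i}
  \Bigl\{ \hat{R}_{n}(\hat{f}_i) + \gamma_i(n) \Bigr\}

% Generic shape of an oracle inequality under a budget T:
% the selected model is compared with the best class, as if that class
% had received the entire budget (n_i(T) samples).
R(\hat{f}_{\hat{\imath}}) - R^{*}
  \;\le\;
  \min_{i} \Bigl\{ \inf_{f \in \mathcal{F}_i} R(f) - R^{*}
      + C\,\gamma_i\bigl(n_i(T)\bigr) \Bigr\}
  + \varepsilon_T
```

In this sketch, the first term inside the minimum is the approximation error of class $i$ and the penalty term stands in for its estimation error. The precise constants, penalties, and remainder terms are those of the paper's theorems, not of this illustration.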
