Optimal Sample-Based Estimates of the Expectation of the Empirical Minimizer

We study sample-based estimates of the expectation of the function produced by the empirical minimization algorithm. We investigate the extent to which one can estimate the rate of convergence of the empirical minimizer in a data-dependent manner. We establish three main results. First, we provide an algorithm that upper bounds the expectation of the empirical minimizer in a completely data-dependent manner. This bound is based on a structural result in [3], which relates expectations to sample averages. Second, we show that these structural

[1]  V. Koltchinskii.  Local Rademacher complexities and oracle inequalities in risk minimization, 2006, 0708.0083.

[2]  Jon A. Wellner, et al.  Weak Convergence and Empirical Processes: With Applications to Statistics, 1996.

[3]  Michael I. Jordan, et al.  Convexity, Classification, and Risk Bounds, 2006.

[4]  Thierry Klein.  Une inégalité de concentration à gauche pour les processus empiriques, 2002.

[5]  A. Tsybakov, et al.  Optimal aggregation of classifiers in statistical learning, 2003.

[6]  M. Rudelson, et al.  Combinatorics of random processes and sections of convex bodies, 2004, math/0404192.

[7]  M. Talagrand.  Sharper Bounds for Gaussian and Empirical Processes, 1994.

[8]  E. Rio, et al.  Inégalités de concentration pour les processus empiriques de classes de parties, 2001.

[9]  S. R. Jammalamadaka, et al.  Empirical Processes in M-Estimation, 2001.

[10]  P. Massart, et al.  About the constants in Talagrand's concentration inequalities for empirical processes, 2000.

[11]  A. W. van der Vaart, et al.  Uniform Central Limit Theorems, 2001.

[12]  Vladimir Vapnik, A. Chervonenkis.  On the uniform convergence of relative frequencies of events to their probabilities, 1971.

[13]  P. Bartlett, et al.  Local Rademacher complexities, 2005, math/0508275.

[14]  Shahar Mendelson, et al.  Improving the sample complexity using global data, 2002, IEEE Trans. Inf. Theory.

[15]  G. Lugosi, et al.  Complexity regularization via localized random penalties, 2004, math/0410091.

[16]  Peter L. Bartlett, et al.  The importance of convexity in learning with squared loss, 1998, COLT '96.

[17]  David Haussler, et al.  Sphere Packing Numbers for Subsets of the Boolean n-Cube with Bounded Vapnik-Chervonenkis Dimension, 1995, J. Comb. Theory, Ser. A.

[18]  V. Koltchinskii, et al.  Rademacher Processes and Bounding the Risk of Function Learning, 2004, math/0405338.

[19]  S. Geer.  Empirical Processes in M-Estimation, 2000.

[20]  M. Talagrand.  New concentration inequalities in product spaces, 1996.

[21]  P. MassartLedoux.  Concentration Inequalities Using the Entropy Method, 2002.

[22]  P. Massart, et al.  Risk bounds for statistical learning, 2007, math/0702683.

[23]  M. Ledoux.  The concentration of measure phenomenon, 2001.

[24]  Shahar Mendelson, et al.  A Few Notes on Statistical Learning Theory, 2002, Machine Learning Summer School.

[25]  O. Bousquet.  Concentration Inequalities and Empirical Processes Theory Applied to the Analysis of Learning Algorithms, 2002.

[26]  P. Massart.  Some applications of concentration inequalities to statistics, 2000.