AdaBoost is Consistent

The risk, or probability of error, of the classifier produced by the AdaBoost algorithm is investigated. In particular, we consider the stopping strategy to be used in AdaBoost to achieve universal consistency. We show that provided AdaBoost is stopped after n^ν iterations, for sample size n and 0 < ν < 1, the sequence of risks of the classifiers it produces approaches the Bayes risk.
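
To make the stopping rule concrete, here is a minimal sketch of AdaBoost run for floor(n^ν) rounds on a sample of size n. The decision-stump base learner, the function names, and the default ν = 0.5 are illustrative assumptions for this sketch; the paper's result concerns the stopping time, not any particular base class.

```python
# Minimal AdaBoost sketch with the n^nu early-stopping rule (assumed details:
# decision stumps as base learners, nu = 0.5; labels y must be in {-1, +1}).
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Predict +/-1 with a single-feature threshold rule."""
    return polarity * np.where(X[:, feature] <= threshold, 1.0, -1.0)

def best_stump(X, y, w):
    """Exhaustively pick the stump minimizing weighted 0-1 error."""
    n, d = X.shape
    best, best_err = (0, X[0, 0], 1.0), np.inf
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for pol in (1.0, -1.0):
                err = np.sum(w * (stump_predict(X, j, thr, pol) != y))
                if err < best_err:
                    best_err, best = err, (j, thr, pol)
    return best, best_err

def adaboost_early_stop(X, y, nu=0.5):
    """AdaBoost stopped after floor(n**nu) rounds, per the abstract."""
    n = len(y)
    t_max = max(1, int(np.floor(n ** nu)))     # the n^nu stopping rule
    w = np.full(n, 1.0 / n)                    # uniform initial weights
    ensemble = []
    for _ in range(t_max):
        stump, err = best_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)   # guard against divide-by-zero
        alpha = 0.5 * np.log((1 - err) / err)  # standard AdaBoost step size
        pred = stump_predict(X, *stump)
        w *= np.exp(-alpha * y * pred)         # up-weight misclassified points
        w /= w.sum()
        ensemble.append((alpha, stump))
    def classify(Xnew):
        score = sum(a * stump_predict(Xnew, *s) for a, s in ensemble)
        return np.sign(score)
    return classify
```

A small usage example on synthetic data (with n = 200 and ν = 0.5, the rule stops after 14 rounds):

```python
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
clf = adaboost_early_stop(X, y, nu=0.5)
print(np.mean(clf(X) == y))   # training accuracy
```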
