Boosting Algorithms for Maximizing the Soft Margin

We present a novel boosting algorithm, called SoftBoost, designed for sets of binary-labeled examples that are not necessarily separable by convex combinations of base hypotheses. Our algorithm achieves robustness by capping the distributions on the examples. The update of the distribution is motivated by minimizing a relative entropy subject to the capping constraints and to constraints on the edges of the base hypotheses obtained so far. The capping constraints imply a soft margin in the dual optimization problem, and our algorithm produces a convex combination of hypotheses whose soft margin is within δ of the maximum. We employ relative entropy projection methods to prove an O((ln N)/δ²) iteration bound for our algorithm, where N is the number of examples. We compare our algorithm with other approaches, including LPBoost, BrownBoost, and SmoothBoost, and show that there exist cases where the number of iterations required by LPBoost grows linearly in N, in contrast to the logarithmic growth for SoftBoost. In simulation studies our algorithm converges about as fast as LPBoost, faster than BrownBoost, and much faster than SmoothBoost. A benchmark comparison illustrates the competitiveness of our approach.
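
The core of the update described above is a relative-entropy projection onto a capped feasible set. The sketch below is our illustration rather than the authors' code: it solves that projection with the off-the-shelf convex solver cvxpy, and the names U, gamma_hat, delta, and nu are assumptions chosen to mirror the quantities in the abstract (edge matrix of past hypotheses, edge estimate, accuracy parameter, and capping parameter).

```python
# A minimal sketch of a SoftBoost-style distribution update, assuming cvxpy.
# It projects the uniform distribution onto the set of capped distributions
# whose edges on all past hypotheses are at most gamma_hat - delta, by
# minimizing the relative entropy to the uniform distribution.
import numpy as np
import cvxpy as cp

def capped_entropy_projection(U, gamma_hat, delta, nu):
    """U[q, i] = y_i * h_q(x_i): edge matrix of the past hypotheses (T x N).
    gamma_hat: current estimate of the achievable edge.
    delta: accuracy parameter from the iteration bound.
    nu: capping parameter; the constraint d_i <= 1/nu yields the soft margin.
    Returns the updated distribution, or None if the constraints are
    infeasible (a natural stopping signal for the boosting loop)."""
    T, N = U.shape
    d0 = np.full(N, 1.0 / N)            # reference (uniform) distribution
    d = cp.Variable(N, nonneg=True)

    # Relative entropy sum_i d_i * log(d_i / d0_i) as the projection objective.
    objective = cp.Minimize(cp.sum(cp.rel_entr(d, d0)))
    constraints = [
        cp.sum(d) == 1,                 # d is a probability distribution
        d <= 1.0 / nu,                  # capping constraint -> soft margin
        U @ d <= gamma_hat - delta,     # edge constraints on past hypotheses
    ]
    problem = cp.Problem(objective, constraints)
    problem.solve()
    if problem.status not in ("optimal", "optimal_inaccurate"):
        return None
    return d.value
```

In a full boosting loop one would call the weak learner on the returned distribution, append the new hypothesis's row to U, tighten gamma_hat, and stop when the projection becomes infeasible; with nu = 1 the cap is vacuous and the projection reduces to the hard-margin case.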

[1] Kenneth O. Kortanek, et al. Semi-Infinite Programming: Theory, Methods, and Applications, 1993, SIAM Rev.

[2] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.

[3] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, J. Comput. Syst. Sci.

[4] Yoav Freund, et al. Boosting the margin: A new explanation for the effectiveness of voting methods, 1997, ICML.

[5] Y. Censor, et al. Parallel Optimization: Theory, Algorithms, and Applications, 1997.

[6] Dale Schuurmans, et al. Boosting in the Limit: Maximizing the Margin of Learned Ensembles, 1998, AAAI/IAAI.

[7] Leo Breiman. Prediction Games and Arcing Algorithms, 1999, Neural Computation.

[8] Yoav Freund. An Adaptive Version of the Boost by Majority Algorithm, 1999, COLT.

[9] Manfred K. Warmuth, et al. Boosting as entropy projection, 1999, COLT.

[10] Gunnar Rätsch, et al. Robust Ensemble Learning, 2000.

[11] Bernhard Schölkopf, et al. New Support Vector Algorithms, 2000, Neural Computation.

[12] Osamu Watanabe, et al. MadaBoost: A Modification of AdaBoost, 2000, COLT.

[13] Mark Herbster, et al. Tracking the Best Linear Predictor, 2001, J. Mach. Learn. Res.

[14] G. Rätsch. Robust Boosting via Convex Optimization, 2001.

[15] Gunnar Rätsch, et al. An Introduction to Boosting and Leveraging, 2002, Machine Learning Summer School.

[16] Rocco A. Servedio. Smooth boosting and learning with malicious noise, 2003, J. Mach. Learn. Res.

[17] Gunnar Rätsch, et al. Soft Margins for AdaBoost, 2001, Machine Learning.

[18] Gunnar Rätsch, et al. Advanced Lectures on Machine Learning, 2004, Lecture Notes in Computer Science.

[19] Corinna Cortes, et al. Support-Vector Networks, 1995, Machine Learning.

[20] Cynthia Rudin, et al. The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins, 2004, J. Mach. Learn. Res.

[21] Ayhan Demiriz, et al. Linear Programming Boosting via Column Generation, 2002, Machine Learning.

[22] Gunnar Rätsch, et al. Sparse Regression Ensembles in Infinite and Finite Hypothesis Spaces, 2002, Machine Learning.

[23] Gunnar Rätsch, et al. Efficient Margin Maximizing with Boosting, 2005, J. Mach. Learn. Res.

[24] Gunnar Rätsch, et al. Totally corrective boosting algorithms that maximize the margin, 2006, ICML.

[25] Gunnar Rätsch, et al. Robust Boosting via Convex Optimization: Theory and Applications, 2007.