Entropy Regularized LPBoost

In this paper we discuss boosting algorithms that maximize the soft margin of the produced linear combination of base hypotheses. LPBoost is the most straightforward boosting algorithm for this purpose: it maximizes the soft margin by solving a linear programming problem. While it performs well on natural data, there are cases where its number of iterations grows linearly in the number of examples rather than logarithmically. By simply adding a relative entropy regularization to the linear objective of LPBoost, we arrive at the Entropy Regularized LPBoost algorithm, for which we prove a logarithmic iteration bound. A previous algorithm, called SoftBoost, has the same iteration bound, but its generalization error often decreases slowly in early iterations. Entropy Regularized LPBoost does not suffer from this problem and has a simpler, more natural motivation.
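
To make the construction concrete, the display below sketches the regularized optimization problem. The notation is an assumption following the standard LPBoost setup rather than anything defined on this page: $N$ is the number of examples, $\mathbf{d}$ a distribution over examples in the probability simplex $\mathcal{P}^N$ capped at $\frac{1}{\nu}$, $\mathbf{u}^q$ the vector of margins of the $q$-th base hypothesis, $\mathbf{d}^0$ the initial (uniform) distribution, and $\eta > 0$ the regularization strength. The idea from the abstract is to add the relative entropy $\Delta(\mathbf{d}, \mathbf{d}^0)$ to LPBoost's linear objective:

$$
\min_{\substack{\mathbf{d} \in \mathcal{P}^N \\ \mathbf{d} \le \frac{1}{\nu}\mathbf{1}}}
\;\max_{1 \le q \le t}\; \mathbf{d} \cdot \mathbf{u}^q
\;+\; \frac{1}{\eta}\,\Delta(\mathbf{d}, \mathbf{d}^0),
\qquad
\Delta(\mathbf{d}, \mathbf{d}^0) = \sum_{n=1}^{N} d_n \ln \frac{d_n}{d_n^0}.
$$

Taking $\eta \to \infty$ recovers LPBoost's plain linear program; keeping $\eta$ finite is what enables the iteration bound logarithmic in the number of examples that the abstract refers to.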
