Geometric Bounds for Generalization in Boosting

We consider geometric conditions on a labeled data set that guarantee that boosting algorithms work well when linear classifiers are used as weak learners. We start by providing conditions on the error of the weak learner that guarantee that the empirical error of the composite classifier is small. We then focus on the conditions required to ensure that the linear weak learner itself achieves an error smaller than 1/2 - γ, where the advantage parameter γ is strictly positive and independent of the sample size. Such a condition guarantees that the generalization error of the boosted classifier decays to its minimal value at a rate of 1/√m, where m is the sample size. The required conditions, which are based solely on geometric concepts, can be easily verified for any data set in time O(m²), and may serve as an indication of the effectiveness of linear classifiers as weak learners for a particular data set.
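The abstract does not spell out the geometric conditions themselves, but a check that runs in O(m²) time typically amounts to a scan over all pairs of the m sample points. The sketch below is a hypothetical example of such a pairwise check: the function name geometric_separation_score and the particular quantity it computes (the minimum distance between oppositely labeled points, normalized by the diameter of the sample) are assumptions for illustration only, not the paper's actual condition.

```python
import numpy as np

def geometric_separation_score(X, y):
    """Hypothetical O(m^2) geometric check: the ratio of the minimum
    distance between oppositely labeled points to the diameter of the
    sample. A larger ratio is used here as a rough proxy for linear
    weak learners having an advantage γ > 0 on this data set.
    (Illustrative only; not the condition derived in the paper.)
    """
    m = len(X)
    min_cross = np.inf  # smallest distance across the two classes
    diameter = 0.0      # largest distance within the whole sample
    for i in range(m):          # all pairs: O(m^2) distance evaluations
        for j in range(i + 1, m):
            d = np.linalg.norm(X[i] - X[j])
            diameter = max(diameter, d)
            if y[i] != y[j]:
                min_cross = min(min_cross, d)
    return min_cross / diameter if diameter > 0 else 0.0

# Toy sample of m = 4 points in the plane with labels in {-1, +1}.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]])
y = np.array([-1, -1, +1, +1])
print(geometric_separation_score(X, y))
```

Whatever the precise geometric quantity, the point carried by the abstract is that such pairwise statistics are cheap to compute, so the suitability of linear weak learners can be assessed before any boosting is run.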