Learning with Ensembles: How Over-fitting Can Be Useful

We study the general characteristics of learning with ensembles. Solving exactly the simple model scenario of an ensemble of linear students, we find surprisingly rich behaviour. For learning in large ensembles, it is advantageous to use under-regularized students, which actually overfit the training data. Globally optimal generalization performance can be obtained by choosing the training set sizes of the students appropriately. For smaller ensembles, optimization of the ensemble weights can yield significant improvements in ensemble generalization performance, in particular if the individual students are subject to noise in the training process. Choosing students with a wide range of regularization parameters makes this improvement robust against changes in the unknown level of noise in the training data.
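The scenario above can be illustrated with a minimal numerical sketch: linear "students" trained by ridge regression (weight decay plays the role of the regularizer), each fitted on its own random subset of a noisy linear teacher's data, with their predictions averaged. The teacher dimensions, noise level, subset size, and regularization value below are illustrative assumptions, not parameters from the paper; the final inequality, however, is the general ambiguity decomposition for squared loss, which guarantees the uniformly weighted ensemble never generalizes worse than the average single student.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy teacher: noisy linear rule in N dimensions, P training examples
# (all sizes here are illustrative assumptions)
N, P = 20, 50
w_teacher = rng.standard_normal(N) / np.sqrt(N)
X = rng.standard_normal((P, N))
y = X @ w_teacher + 0.3 * rng.standard_normal(P)

def ridge_fit(X, y, lam):
    """A linear 'student': least squares with weight-decay (ridge) penalty lam."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Ensemble of under-regularized students, each trained on its own
# random subset of the data (small lam -> each student overfits)
K, subset = 10, 30
students = [
    ridge_fit(X[idx], y[idx], lam=1e-3)
    for idx in (rng.choice(P, size=subset, replace=False) for _ in range(K))
]

# Generalization error on fresh teacher data: single students vs. ensemble
X_test = rng.standard_normal((1000, N))
y_test = X_test @ w_teacher
preds = np.stack([X_test @ w for w in students])     # (K, 1000)
mse_single = np.mean((preds - y_test) ** 2, axis=1)  # per-student errors
mse_ens = np.mean((preds.mean(axis=0) - y_test) ** 2)

# Ambiguity decomposition for squared loss: ensemble error equals the
# average single-student error minus the (non-negative) ambiguity, so
assert mse_ens <= mse_single.mean() + 1e-12
print(f"avg single-student MSE: {mse_single.mean():.4f}, ensemble MSE: {mse_ens:.4f}")
```

The guaranteed gap between the two errors grows with the disagreement (ambiguity) among students, which is why deliberately overfit, diverse students can help in a large ensemble.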