A Probabilistic Ensemble Pruning Algorithm

An ensemble is a group of learners that work together as a committee to solve a problem. However, the existing ensemble training algorithms sometimes generate unnecessary large ensembles, which consume extra computational resource and may degrade the performance. Ensemble pruning algorithm aims to find a good subset of ensemble members to constitute a small ensemble, which saves the computational resource and performs as well as, or better than, the non-pruned ensemble. This paper introduces a probabilistic ensemble pruning algorithm by choosing a set of "sparse" combination weights, most of which are zero, to prune the large ensemble. In order to obtain the set of sparse combination weights and satisfy the non-negative restriction of the combination weights, a left-truncated, non-negative, Gaussian prior is adopted over every combination weight. Expectation-maximization algorithm is employed to obtain maximum a posterior (MAP) estimation of weight vector. Four benchmark regression problems and another four benchmark classification problems have been employed to demonstrate the effectiveness of the method