The Effectiveness of a New Negative Correlation Learning Algorithm for Classification Ensembles

In an earlier paper, we proposed a new negative correlation learning (NCL) algorithm for classification ensembles, called AdaBoost.NC, which performs significantly better than standard AdaBoost and other NCL algorithms on many benchmark data sets at low computational cost. In this paper, we give deeper insight into the algorithm from both theoretical and experimental perspectives to understand its effectiveness. We explain why AdaBoost.NC can reduce error correlation within the ensemble and improve classification performance. We also show the role of the $amb$ (penalty) term in the training error. Finally, we examine the effectiveness of AdaBoost.NC by varying two pre-defined parameters: the penalty strength $\lambda$ and the ensemble size $T$. Experiments on both artificial and real-world data sets show that AdaBoost.NC does produce smaller error correlation as training epochs proceed, and a lower test error than standard AdaBoost. The optimal $\lambda$ depends on the problem domain and the base learner. The performance of AdaBoost.NC becomes stable as $T$ gets larger, and the algorithm is more effective when $T$ is comparatively small.
