A heterogeneous ensemble of trees

A decision tree is a simple yet popular machine learning algorithm. Although a single decision tree is not as accurate as state-of-the-art classifiers, its performance can be improved significantly by combining the predictions of several decision trees, i.e., by creating an ensemble of trees. In this paper, we study decision trees and their ensembles, namely Bagged Decision Trees, Random Forest, Extremely Randomized Trees, Rotation Forest, Gradient Boosted Trees, and AdaBoosted Trees, and assess their performance on several UCI datasets. In addition, we propose a new ensemble method, the Heterogeneous Ensemble of Trees, and compare its performance with that of existing tree-based classifiers. The heterogeneous ensemble combines three different tree ensembles (Random Forest, Rotation Forest, and Extremely Randomized Trees) in equal proportions to boost the diversity of the trees in the ensemble. A weighting scheme based on out-of-bag error is employed to combine the predictions of the individual trees into the final output prediction. In experiments on several UCI datasets, the Heterogeneous Ensemble of Trees obtains the best rank among the tree-based classifiers compared.
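
The construction just described can be sketched in a few lines of scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: Rotation Forest has no scikit-learn estimator, so bagged decision trees stand in for that member; Extremely Randomized Trees are grown with bootstrap sampling (unlike the canonical method) so that an out-of-bag estimate is available; and weighting members by their normalized out-of-bag accuracy is one plausible reading of the out-of-bag-error-based scheme. The function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)

def fit_heterogeneous_ensemble(X, y, trees_per_member=50, seed=0):
    """Fit three tree ensembles in equal proportions and weight them
    by out-of-bag accuracy (one plausible OOB-error-based scheme)."""
    members = [
        RandomForestClassifier(n_estimators=trees_per_member,
                               oob_score=True, random_state=seed),
        # Canonical Extremely Randomized Trees do not bootstrap;
        # bootstrap=True is set here so an OOB estimate is available.
        ExtraTreesClassifier(n_estimators=trees_per_member,
                             bootstrap=True, oob_score=True,
                             random_state=seed),
        # Stand-in for Rotation Forest, which scikit-learn lacks:
        # bagged decision trees (BaggingClassifier's default estimator).
        BaggingClassifier(n_estimators=trees_per_member,
                          oob_score=True, random_state=seed),
    ]
    for member in members:
        member.fit(X, y)
    # Normalize the OOB accuracies so the member weights sum to one.
    oob_acc = np.array([m.oob_score_ for m in members])
    weights = oob_acc / oob_acc.sum()
    return members, weights

def predict_heterogeneous_ensemble(members, weights, X):
    """Average class probabilities with the OOB-derived weights."""
    proba = sum(w * m.predict_proba(X) for m, w in zip(members, weights))
    return members[0].classes_[np.argmax(proba, axis=1)]
```

A practical appeal of out-of-bag weighting is that no separate validation set is needed: the samples left out of each bootstrap already act as an internal validation set for each ensemble member.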
