Towards generating random forests via extremely randomized trees

The classification error of a given classifier can be decomposed into bias and variance. Decision-tree-based classifiers have very low bias but extremely high variance. Ensemble methods such as bagging can significantly reduce the variance of such unstable classifiers and thus yield an ensemble classifier with promising generalization performance. In this paper, we compare different tree-induction strategies within a uniform ensemble framework. Results on several public datasets show that random partitioning (of the cut-point for a univariate decision tree, or of both the coefficients and the cut-point for a multivariate decision tree) without exhaustive search at each node of a decision tree can yield better performance at lower computational cost.
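To make the contrast concrete, the following is a minimal sketch of the two node-splitting strategies compared here: an exhaustive CART-style cut-point search versus an Extra-Trees-style random cut-point draw. The function names, the Gini criterion, and the "best of k random candidates" rule are illustrative assumptions, not the exact procedure of the paper.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_score(X, y, feature, threshold):
    """Weighted Gini impurity of the two children induced by (feature, threshold)."""
    left = X[:, feature] <= threshold
    n = len(y)
    return (left.sum() / n) * gini(y[left]) + ((~left).sum() / n) * gini(y[~left])

def exhaustive_split(X, y):
    """CART-style node split: try every feature and every candidate
    cut-point (midpoints of sorted unique values) and keep the best."""
    best = (None, None, np.inf)
    for f in range(X.shape[1]):
        values = np.unique(X[:, f])
        for t in (values[:-1] + values[1:]) / 2.0:
            s = split_score(X, y, f, t)
            if s < best[2]:
                best = (f, t, s)
    return best

def random_split(X, y, k=1, rng=None):
    """Extra-Trees-style node split: for each of k randomly chosen features,
    draw one cut-point uniformly between that feature's min and max,
    then keep the best of these k random candidates (no exhaustive search)."""
    rng = np.random.default_rng(rng)
    best = (None, None, np.inf)
    for f in rng.choice(X.shape[1], size=min(k, X.shape[1]), replace=False):
        lo, hi = X[:, f].min(), X[:, f].max()
        if lo == hi:
            continue  # constant feature at this node, no valid cut-point
        t = rng.uniform(lo, hi)
        s = split_score(X, y, f, t)
        if s < best[2]:
            best = (f, t, s)
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    print("exhaustive:", exhaustive_split(X, y))
    print("random    :", random_split(X, y, k=3, rng=1))
```

The random strategy replaces the inner loop over all candidate thresholds with a single uniform draw per sampled feature, which is the source of both the reduced computational cost and the extra per-tree randomization that the ensemble later averages away.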
