Heterogeneous oblique random forest

Abstract

Decision trees in random forests split the data at each non-leaf node using a single feature. Such splits produce axis-parallel decision boundaries, which may fail to exploit the geometric structure in the data. Oblique decision trees instead split on an oblique hyperplane. Trees with such hyperplanes can better exploit the geometric structure, which increases their accuracy and reduces their depth. Existing implementations of oblique decision trees do not evaluate many promising oblique splits before selecting the best one. In this paper, we propose a random forest of heterogeneous oblique decision trees that apply several linear classifiers at each non-leaf node to a set of top-ranked partitions. These partitions are obtained via one-vs-all and two-hyperclass approaches and are ranked by ideal Gini scores and cluster separability. The oblique hyperplane that optimizes the impurity criterion is then selected as the splitting hyperplane for that node. We benchmark 190 classifiers on 121 UCI datasets. The results show that the oblique random forests proposed in this paper occupy the top three ranks, with the heterogeneous oblique random forest being statistically better than the other 189 classifiers.
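The node-level selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes only one-vs-all partitions, uses a single ridge least-squares classifier as a stand-in for the several linear classifiers, and omits the ranking by ideal Gini scores and cluster separability. The function names (`gini`, `split_impurity`, `fit_ridge_hyperplane`, `best_oblique_split`) and the toy data are hypothetical.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label vector."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - np.sum(p ** 2)

def split_impurity(y, left_mask):
    """Size-weighted Gini impurity of the two child nodes."""
    n = y.size
    n_left = left_mask.sum()
    return (n_left * gini(y[left_mask]) + (n - n_left) * gini(y[~left_mask])) / n

def fit_ridge_hyperplane(X, t, reg=1e-3):
    """Least-squares hyperplane (w, b) for +/-1 targets t.

    Assumption: ridge regression stands in for the linear classifiers
    that the paper evaluates at each node.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    wb = np.linalg.solve(A, Xb.T @ t)
    return wb[:-1], wb[-1]

def best_oblique_split(X, y):
    """Try one hyperplane per one-vs-all partition; keep the lowest impurity."""
    best = None
    for c in np.unique(y):
        t = np.where(y == c, 1.0, -1.0)      # class c vs. the rest
        w, b = fit_ridge_hyperplane(X, t)
        left = X @ w + b > 0                 # side of the hyperplane
        score = split_impurity(y, left)      # judged on the original labels
        if best is None or score < best[0]:
            best = (score, w, b, c)
    return best

# Toy data: three Gaussian clusters in 2-D (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 4.0], [0.0, 4.0]])
X = np.vstack([rng.normal(c, 0.5, size=(30, 2)) for c in centers])
y = np.repeat([0, 1, 2], 30)

score, w, b, cls = best_oblique_split(X, y)
print("parent impurity:", gini(y))
print("best split impurity:", score, "from class", cls, "vs. rest")
```

In a full tree, the selected hyperplane would send samples to the left or right child, and the same procedure would repeat recursively on each child until a stopping criterion is met.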
