Margin Trees for High-dimensional Classification

We propose a method for classifying observations into more than two classes from high-dimensional features. Our approach builds a binary decision tree in a top-down manner, using the optimal margin classifier at each split. We implement an exact greedy algorithm for this task and compare its performance to less greedy procedures based on clustering the matrix of pairwise margins. We compare the resulting "margin tree" to the closely related "all-pairs" (one-versus-one) support vector machine and to nearest centroids on a number of cancer microarray data sets. We also develop a simple method for feature selection. We find that the margin tree has accuracy that is competitive with other methods and offers additional interpretability in its putative grouping of the classes.
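To make the clustering-based alternative concrete, the following is a minimal sketch of how a class hierarchy could be derived from a matrix of pairwise margins. It is an illustration under stated assumptions, not the paper's implementation: the function name `complete_linkage_tree`, the choice of complete linkage, and the margin values are all hypothetical, and the pairwise margins are taken as given rather than computed by fitting a classifier to each pair of classes.

```python
from itertools import combinations


def complete_linkage_tree(labels, margin):
    """Build a binary class hierarchy from pairwise margins.

    labels: list of class names.
    margin: dict mapping frozenset({a, b}) to the margin between
            classes a and b (hypothetical values in this sketch).
    Returns the tree as nested tuples of class names.
    """
    # Each cluster is a pair: (subtree, set of member classes).
    clusters = [(lab, frozenset([lab])) for lab in labels]

    def linkage(c1, c2):
        # Complete linkage (an assumption): the distance between two
        # groups is the largest pairwise margin between their members.
        return max(margin[frozenset((a, b))] for a in c1[1] for b in c2[1])

    while len(clusters) > 1:
        # Merge the two groups that are hardest to separate, so the
        # best-separated groups end up split near the root of the tree.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] | clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return clusters[0][0]


# Hypothetical pairwise margins for four classes.
pairwise = {
    frozenset(("A", "B")): 0.5,
    frozenset(("C", "D")): 0.7,
    frozenset(("A", "C")): 3.0,
    frozenset(("A", "D")): 3.2,
    frozenset(("B", "C")): 2.8,
    frozenset(("B", "D")): 3.1,
}

tree = complete_linkage_tree(["A", "B", "C", "D"], pairwise)
print(tree)  # (('A', 'B'), ('C', 'D'))
```

In this toy input, classes A and B (and likewise C and D) are separated by small margins and are grouped first, so the root split separates {A, B} from {C, D}. In a full margin tree, each internal node would then hold a maximum-margin classifier trained to separate its two child groups.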
