On the fusion of threshold classifiers for categorization and dimensionality reduction

We study ensembles of simple threshold classifiers for categorizing high-dimensional data of low cardinality (few samples relative to the number of features) and give a compression bound on their prediction risk. We use two approaches to construct such classifiers. The first is univariate feature selection that ranks features by the area under the ROC curve. The second is a greedy selection strategy. Both methods are applied to artificial data, published microarray expression profiles, and highly imbalanced data.
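To make the first approach concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary labels in {0, 1}, an unweighted majority vote as the fusion rule, and per-feature thresholds chosen by minimum training error; none of these details is stated in the abstract. The function names (`auc_per_feature`, `fit_stump`, `fit_ensemble`, `predict`) and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np

def auc_per_feature(X, y):
    """AUC of each column of X when used directly as a score for class 1."""
    pos, neg = X[y == 1], X[y == 0]
    # Pairwise (Mann-Whitney) form of the AUC; ties count one half.
    greater = (pos[:, None, :] > neg[None, :, :]).mean(axis=(0, 1))
    ties = (pos[:, None, :] == neg[None, :, :]).mean(axis=(0, 1))
    return greater + 0.5 * ties

def fit_stump(x, y):
    """Threshold and direction with the lowest training error on one feature."""
    values = np.unique(x)
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = (values[:-1] + values[1:]) / 2 if values.size > 1 else values
    best_err, best_t, best_sign = np.inf, candidates[0], 1
    for t in candidates:
        for sign in (1, -1):
            pred = (sign * (x - t) > 0).astype(int)
            err = np.mean(pred != y)
            if err < best_err:
                best_err, best_t, best_sign = err, t, sign
    return best_t, best_sign

def fit_ensemble(X, y, k=3):
    """Rank features by AUC and fit one threshold classifier per top-k feature."""
    auc = auc_per_feature(X, y)
    # An AUC near 0 is as informative as one near 1 (the direction is reversed).
    top = np.argsort(-np.abs(auc - 0.5))[:k]
    return [(int(j), *fit_stump(X[:, j], y)) for j in top]

def predict(ensemble, X):
    """Unweighted majority vote over the threshold classifiers."""
    votes = np.array([sign * (X[:, j] - t) > 0 for j, t, sign in ensemble])
    return (votes.mean(axis=0) > 0.5).astype(int)

# Synthetic example: 40 samples, 200 features, only the first three informative.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 200))
X[:, :3] += 10.0 * y[:, None]

ensemble = fit_ensemble(X, y, k=3)
selected = sorted(j for j, _, _ in ensemble)
train_pred = predict(ensemble, X)
```

With an odd `k` the vote cannot tie. The ensemble uses only `k` of the original features, which is the sense in which such a classifier also reduces dimensionality.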
