Feature selection for high-dimensional multi-category data using PLS-based local recursive feature elimination

This paper addresses high-dimensional and ultrahigh-dimensional multi-category problems and presents a feature selection framework based on local recursive feature elimination (Local-RFE). Within this framework, we propose a new feature selection algorithm, PLS-based Local-RFE (LRFE-PLS). To evaluate the effectiveness of the proposed methodology by comparison, we also present PLS-based Global-RFE, which takes all categories into consideration simultaneously. The advantage of the proposed algorithms is that PLS-based feature ranking quickly deletes irrelevant features while RFE concurrently removes some redundant features, so the selected feature subset is more compact. The proposed algorithms are compared with several state-of-the-art methods on multiple datasets. Experimental results show that they are competitive and work effectively for high-dimensional multi-category data, and statistical significance tests show that the LRFE-PLS algorithm has better performance. The proposed algorithms can be applied not only to microarray data analysis but also to image recognition and financial data analysis.
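The abstract does not spell out the algorithmic details, so the following is only a minimal sketch of how a PLS-based local RFE could look, not the authors' implementation. It assumes that "local" means one one-versus-rest elimination run per category followed by a union of the surviving features, that features are ranked by the absolute PLS1 regression coefficient on standardized data, and that a fixed fraction of features is dropped per round. The function names (`pls1_coef`, `rfe_pls`, `local_rfe_pls`) and the parameters `n_comp`, `n_keep`, and `drop_frac` are hypothetical.

```python
import numpy as np


def pls1_coef(X, y, n_comp=2):
    """PLS1 regression coefficients via NIPALS on standardized X and centered y."""
    Xk = X - X.mean(axis=0)
    std = Xk.std(axis=0)
    Xk = Xk / np.where(std > 0, std, 1.0)  # guard against constant columns
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(min(n_comp, X.shape[1])):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                     # score vector
        tt = t @ t
        p = Xk.T @ t / tt              # X loading
        c = (yk @ t) / tt              # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)  # B = W (P^T W)^{-1} q


def rfe_pls(X, y, n_keep, n_comp=2, drop_frac=0.5):
    """Re-rank the surviving features each round and drop the weakest ones."""
    active = np.arange(X.shape[1])
    while active.size > n_keep:
        scores = np.abs(pls1_coef(X[:, active], y, n_comp))  # assumed ranking score
        n_next = max(n_keep, int(active.size * (1 - drop_frac)))
        n_next = min(n_next, active.size - 1)  # always make progress
        best = np.argsort(scores)[::-1][:n_next]
        active = np.sort(active[best])
    return active


def local_rfe_pls(X, labels, n_keep, n_comp=2, drop_frac=0.5):
    """Assumed 'local' scheme: one one-vs-rest RFE per category, then the union."""
    selected = set()
    for c in np.unique(labels):
        y = (labels == c).astype(float)  # binary response for this category
        selected.update(rfe_pls(X, y, n_keep, n_comp, drop_frac).tolist())
    return sorted(selected)


if __name__ == "__main__":
    # Synthetic data: feature c is shifted for the samples of category c.
    rng = np.random.default_rng(0)
    labels = np.repeat([0, 1, 2], 30)
    X = rng.normal(size=(90, 40))
    for c in range(3):
        X[labels == c, c] += 4.0
    print(local_rfe_pls(X, labels, n_keep=3))
```

Dropping a fraction of the features per round keeps the number of PLS fits logarithmic in the number of features, which matters in the ultrahigh-dimensional setting. A global variant would instead rank features against all categories at once, for example with a multi-response PLS fit on an indicator matrix, and is not shown here.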
