Feature Selection Based on Grouped Sorting

As an effective dimensionality reduction technique, feature selection removes irrelevant variables and improves accuracy in machine learning. In this paper, a feature selection method based on grouped sorting is proposed for high-dimensional data. Features are first grouped using the redundancy between features as the grouping criterion, and the features within each group are then ranked by their classifying capacity. Finally, the top-ranked feature of each group is selected to form a new feature space, so that both redundant and irrelevant features are removed. We evaluate the proposed method in numerical experiments on several data sets with different typical classifiers. The experimental results suggest that the method is effective.
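The three steps above can be sketched in code. The abstract does not specify how redundancy or classifying capacity is measured, so this illustration assumes absolute Pearson correlation as the redundancy criterion (with a hypothetical `redundancy_threshold` parameter), mutual information with the class label as the measure of classifying capacity, and a greedy one-pass grouping; the paper's actual choices may differ.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif

def grouped_sort_select(X, y, redundancy_threshold=0.8):
    """Sketch of grouped-sorting feature selection.

    Assumed (not given by the abstract): redundancy between two
    features is their absolute Pearson correlation, and classifying
    capacity is mutual information between a feature and the labels.
    """
    n_features = X.shape[1]
    corr = np.abs(np.corrcoef(X, rowvar=False))

    # Step 1: greedy grouping -- features whose correlation with the
    # group's seed feature exceeds the threshold are treated as
    # mutually redundant and placed in the same group.
    groups, assigned = [], set()
    for i in range(n_features):
        if i in assigned:
            continue
        group = [j for j in range(n_features)
                 if j not in assigned and corr[i, j] >= redundancy_threshold]
        assigned.update(group)
        groups.append(group)

    # Step 2: rank the features in each group by classifying capacity.
    mi = mutual_info_classif(X, y, random_state=0)

    # Step 3: keep only the top-ranked feature of each group.
    return sorted(max(g, key=lambda j: mi[j]) for g in groups)

X, y = load_iris(return_X_y=True)
selected = grouped_sort_select(X, y)
print(selected)
```

On the iris data the two petal measurements and sepal length are highly correlated, so they fall into one group and only the most informative of them survives, alongside the weakly correlated sepal width. The number of selected features equals the number of groups, which the threshold controls.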
