A decision rule-based method for feature selection in predictive data mining

Feature selection algorithms for classification problems in predictive data mining attempt to select features that are relevant to the classification task and not redundant. A relevant feature is defined as one which is highly correlated with the target function. One problem with this definition is that there is no universally accepted meaning of 'highly correlated', whether the correlation is with the target function or with the other features. This paper proposes a new feature selection algorithm which incorporates domain-specific definitions of high, medium and low correlations. The proposed algorithm conducts a heuristic search for the features most relevant to the prediction task.
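The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: correlation levels defined by adjustable cut-offs, combined with a greedy search that keeps relevant features and skips redundant ones. The cut-off values, the use of Pearson correlation, the greedy strategy, and all names (`THRESHOLDS`, `level`, `select_features`) are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt

# Hypothetical cut-offs for |r|; in practice a domain expert would supply these.
THRESHOLDS = {"medium": 0.3, "high": 0.5}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def level(r, thresholds=THRESHOLDS):
    """Map a correlation coefficient to 'high', 'medium' or 'low'."""
    r = abs(r)
    if r >= thresholds["high"]:
        return "high"
    if r >= thresholds["medium"]:
        return "medium"
    return "low"


def select_features(features, target, thresholds=THRESHOLDS):
    """Greedy sketch: keep features highly correlated with the target
    that are not highly correlated with an already selected feature."""
    relevance = {name: abs(pearson(col, target))
                 for name, col in features.items()}
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    selected = []
    for name in ranked:
        if level(relevance[name], thresholds) != "high":
            break  # all remaining features are less relevant
        redundant = any(
            level(pearson(features[name], features[s]), thresholds) == "high"
            for s in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Toy data: f2 is a rescaled copy of f1, f3 is only weakly related to the target.
target = [0, 0, 0, 1, 1, 1]
features = {
    "f1": [1, 2, 3, 7, 8, 9],
    "f2": [2, 4, 6, 14, 16, 18],
    "f3": [1, 0, 1, 0, 1, 0],
}
chosen = select_features(features, target)
```

Changing the values in `THRESHOLDS` changes which features count as relevant or redundant, which is the kind of domain-specific tuning the abstract refers to.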
