Feature weighting and selection with a Pareto-optimal trade-off between relevancy and redundancy

- Presents a simultaneous feature weighting and selection method.
- Uses a non-linear information measure with an L1 penalty to form the objective functions.
- Experimentally determines the best constraint on the weight vectors.
- Uses MOEA/D to solve the bi-objective optimization problem.
- Compares results with prominent state-of-the-art techniques.

Feature Selection (FS) is an important pre-processing step in machine learning that reduces the number of features/variables used to describe each member of a dataset. This reduction is achieved by eliminating non-discriminating and redundant features and retaining a subset of the existing features with higher discriminating power across the classes in the data. In this paper, we formulate feature selection as a bi-objective optimization problem over real-valued weights, one per feature. A subset of the weighted features is then selected as the best subset for subsequent classification of the data. Two information-theoretic measures, relevancy and redundancy, are chosen to design the objective functions for a highly competitive Multi-Objective Optimization (MOO) algorithm, the Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D). We experimentally determine the best possible constraints on the weights to be optimized. We evaluate the proposed bi-objective feature selection and weighting framework on 15 standard datasets using the popular k-Nearest Neighbor (k-NN) classifier. As the experimental results show, our method is quite competitive with several state-of-the-art FS methods of current interest. We further demonstrate the robustness of the framework by swapping the optimizer and the classifier for the Non-dominated Sorting Genetic Algorithm II (NSGA-II) and Support Vector Machines (SVMs), respectively.
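
To make the relevancy/redundancy trade-off concrete, the sketch below scores a candidate weight vector with two mutual-information-based objectives that an MOO solver such as MOEA/D or NSGA-II could minimize jointly. This is only a minimal illustration, not the paper's exact formulation: the histogram-based MI estimator, the 0.5 selection threshold, and the way the weights enter the two terms are assumptions made for the example.

```python
# Minimal sketch of a bi-objective relevancy/redundancy score for a weighted
# feature subset.  NOT the authors' exact objective functions: the histogram
# MI estimator, the 0.5 threshold, and the weighting scheme are illustrative.
import numpy as np


def mutual_information(x, y, bins=10):
    """Histogram-based estimate of I(x; y) in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # avoid log(0) on empty cells
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def bi_objective(w, X, y, threshold=0.5):
    """Return (negative relevancy, redundancy) for the subset {i : w_i > threshold}.

    Both values are to be minimized, so maximizing relevancy and minimizing
    redundancy become a bi-objective minimization problem.
    """
    selected = np.where(w > threshold)[0]
    if selected.size == 0:
        return 0.0, 0.0
    # Relevancy: weighted MI between each selected feature and the class labels.
    relevancy = sum(w[i] * mutual_information(X[:, i], y) for i in selected)
    # Redundancy: weighted pairwise MI among the selected features.
    redundancy = sum(
        w[i] * w[j] * mutual_information(X[:, i], X[:, j])
        for a, i in enumerate(selected)
        for j in selected[a + 1:]
    )
    return -relevancy, redundancy


if __name__ == "__main__":
    # Toy usage: an evolutionary MOO algorithm would evolve w in [0, 1]^d,
    # evaluating each candidate with bi_objective().
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
    w = rng.uniform(size=5)
    print(bi_objective(w, X, y))
```

In a full pipeline, the non-dominated weight vectors returned by the solver would then be thresholded to obtain candidate feature subsets, and a classifier such as k-NN or an SVM would be used to pick the final solution.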
