A novel bacterial algorithm with randomness control for feature selection in classification

Feature selection (FS) is an essential data-processing technique for reducing the number of features and improving classification performance, but it is also a challenging problem because of the large search space and the complex interactions between features. Bacterial-based algorithms (BAs) are effective population-based techniques known for their global search capability. This paper proposes a novel bacterial algorithm, based on control mechanisms and modified population-updating strategies, for feature selection in classification. The proposed method, abbreviated as BAFS, employs three parameters to control the randomness of the population and to reduce computational complexity by avoiding redundant searches for the optimum. To make the solutions suitable for feature selection, the reproduction and elimination strategies are modified according to classification performance and the occurrence of features, respectively. Feature distribution is measured by the probability that a feature appears in the most promising subsets. The proposed algorithm is used to select the best feature subsets on datasets of varying dimensionality. Comparison studies on five bacterial-based algorithms indicate that BAFS outperforms the other algorithms, achieving higher classification performance.
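
As an illustration of the feature-distribution measure mentioned above, the following minimal Python sketch estimates how often each feature appears among the best-scoring subsets of a population. It is not the BAFS implementation: the function name `feature_occurrence`, the binary-mask encoding of subsets, and the `top_k` cutoff that defines "most promising" are all assumptions made for this example.

```python
from typing import List, Sequence


def feature_occurrence(
    subsets: Sequence[Sequence[int]],
    scores: Sequence[float],
    top_k: int,
) -> List[float]:
    """Fraction of the top_k best-scoring subsets that contain each feature.

    Assumptions (not taken from the paper): each subset is a 0/1 mask of
    equal length, and a higher score means better classification performance.
    """
    if not subsets or len(subsets) != len(scores) or top_k <= 0:
        raise ValueError("need matching non-empty subsets/scores and top_k > 0")

    n_features = len(subsets[0])
    # Indices of the top_k subsets, best score first.
    ranked = sorted(range(len(subsets)), key=lambda i: scores[i], reverse=True)
    best = ranked[:top_k]

    # Count how many of the selected subsets include each feature.
    counts = [0] * n_features
    for i in best:
        for j, bit in enumerate(subsets[i]):
            if bit:
                counts[j] += 1

    return [c / len(best) for c in counts]


# Hypothetical population of four candidate subsets over three features.
population = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
accuracy = [0.9, 0.8, 0.4, 0.7]
occurrence = feature_occurrence(population, accuracy, top_k=3)
```

In this toy population the first feature is present in all three of the best subsets, so its estimated occurrence is 1.0; an elimination step could plausibly use such values to favour frequently selected features, though how BAFS does this is not specified in the abstract.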
