Bacterial colony algorithm with adaptive attribute learning strategy for feature selection in classification of customers for personalized recommendation

Abstract This paper investigates a new bacterial colony-based feature selection algorithm that improves the accuracy of customer classification for personalized product recommendation. An attribute learning strategy is developed to update the feature-related population. Specifically, features are weighted according to their historical contributions to both the individual- and group-based subsets. In addition, the frequency of appearance of each candidate feature is recorded to improve the diversity of the feature distribution and avoid over-fitting. Based on the weight-based feature indexes and the frequency-of-appearance records, the performance of feature subsets is enhanced by replacing features that appear repeatedly in the same vector. To explore the feasibility of the proposed method for missing-feature problems, the optimization objective is to minimize the classification error using an acceptable number of features. K-Nearest Neighbor is employed as the learning technique alongside the proposed feature selection method. The effectiveness of the proposed method is demonstrated by tests on datasets from the UCI Machine Learning Repository and on real-world data from Amazon customer product reviews. Compared with seven other feature selection methods, the proposed algorithm achieves higher classification accuracy using fewer features.
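The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `repair_subset`, the scoring rule (weight divided by one plus the appearance count), and the in-place update of the frequency record are all assumptions made for illustration. The abstract states only that feature weights and appearance frequencies guide the replacement of features repeated in the same vector.

```python
def repair_subset(subset, weights, counts):
    """Replace repeated feature indexes in `subset` with unused candidates.

    subset  : list of feature indexes, possibly containing duplicates
    weights : weights[i] is the learned weight of feature i (assumed given)
    counts  : counts[i] is how often feature i has appeared so far;
              updated in place for the features in the repaired subset
    """
    n_features = len(weights)
    used = set(subset)   # every index already present in the vector
    kept = set()         # indexes whose first occurrence has been kept
    repaired = []
    for idx in subset:
        if idx not in kept:
            kept.add(idx)
            repaired.append(idx)
            continue
        # Duplicate found: pick a feature not yet in the vector.
        candidates = [i for i in range(n_features) if i not in used]
        if not candidates:
            continue  # no substitute available, so drop the duplicate
        # Hypothetical score: favor high weight and low past frequency.
        best = max(candidates, key=lambda i: weights[i] / (1.0 + counts[i]))
        used.add(best)
        repaired.append(best)
    # Record one more appearance for each feature in the repaired subset.
    for idx in repaired:
        counts[idx] += 1
    return repaired


weights = [0.9, 0.5, 0.8, 0.1, 0.6]
counts = [3, 0, 2, 1, 4]
print(repair_subset([0, 2, 2, 3], weights, counts))  # [0, 2, 1, 3]
```

In this example, the second occurrence of feature 2 is replaced by feature 1 rather than feature 4. Feature 4 has the higher weight, but it has already appeared four times, so the assumed score favors the less frequently used candidate. This is the diversity effect the frequency record is meant to provide.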
