Evolutionary feature subspaces generation for ensemble classification

Ensemble learning is a powerful machine learning paradigm that leverages a collection of diverse base learners to achieve better prediction performance than any individual base learner could attain on its own. This work proposes an ensemble learning framework based on evolutionary feature subspace generation, which formulates the search for the most suitable feature subspace for each base learner as an optimization problem. The problems corresponding to the different base learners are solved simultaneously by an evolutionary multi-task feature selection algorithm, so that solving one problem may help solve the others via implicit knowledge transfer. The feature subspaces generated in this way are expected to outperform those obtained by seeking the optimal feature subspace for each base learner independently. We instantiate the proposed framework by using SVM, KNN, and decision tree as the base learners, proposing a multi-task binary particle swarm optimization algorithm for evolutionary multi-task feature selection, and combining the outputs of the base learners with a majority voting scheme. Experiments on several UCI datasets demonstrate the effectiveness of the proposed method.
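To make the pipeline concrete, the following is a minimal, simplified sketch of the per-learner feature subspace search and majority-vote combination described above, assuming scikit-learn and NumPy. It runs an ordinary (single-task) binary PSO independently for each base learner, so it deliberately omits the cross-task knowledge transfer that is the paper's core contribution; all parameter values (swarm size, iteration count, inertia and acceleration coefficients) are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def binary_pso_feature_select(clf, X, y, n_particles=8, n_iters=10, seed=0):
    """Search for a feature mask maximizing 3-fold CV accuracy of clf."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    vel = rng.normal(0.0, 1.0, (n_particles, d))
    pos = rng.random((n_particles, d)) < 0.5          # boolean feature masks

    def fitness(mask):
        if not mask.any():                            # empty subspace is invalid
            return 0.0
        return cross_val_score(clf, X[:, mask], y, cv=3).mean()

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, d))
        vel = (0.7 * vel
               + 1.5 * r1 * (pbest.astype(float) - pos)
               + 1.5 * r2 * (gbest.astype(float) - pos))
        vel = np.clip(vel, -6.0, 6.0)                 # keep sigmoid well-behaved
        # Sample new binary positions from the sigmoid of the velocity.
        pos = rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest

# Each base learner gets its own evolved feature subspace.
X, y = load_breast_cancer(return_X_y=True)            # stand-in UCI dataset
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0, stratify=y)
learners = [SVC(), KNeighborsClassifier(), DecisionTreeClassifier(random_state=0)]

preds = []
for clf in learners:
    mask = binary_pso_feature_select(clf, Xtr, ytr)
    clf.fit(Xtr[:, mask], ytr)
    preds.append(clf.predict(Xte[:, mask]))

# Majority voting over the three binary predictions.
votes = np.stack(preds)
ensemble_pred = (votes.sum(axis=0) >= 2).astype(int)
acc = (ensemble_pred == yte).mean()
print(f"ensemble accuracy: {acc:.3f}")
```

In the full framework, the three searches would instead form the tasks of one multi-task binary PSO, sharing information between swarms rather than running in isolation.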
