Genetic Approach to Feature Selection for Ensemble Creation

Ensembles of classifiers have proven highly effective for case-based classification tasks. Most ensemble construction algorithms use the complete set of features available in the problem domain. Recent work has shown that building ensemble members on randomly selected feature subspaces can considerably improve ensemble accuracy. In this paper we focus on feature selection for ensemble creation using a genetic search approach. We compare boosting and bagging techniques under three feature selection approaches for ensemble construction. Our genetic-based method produces more reliable ensembles and achieves memory reductions of up to 80% on the datasets employed.
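To make the core idea concrete, the sketch below shows genetic search over feature subsets: chromosomes are bitmasks over the features, and fitness is leave-one-out 1-NN accuracy on the selected subset. This is a minimal, hypothetical illustration using a toy dataset and simple truncation selection with one-point crossover, not the paper's actual algorithm or datasets.

```python
# Toy sketch of genetic feature-subset selection (hypothetical example).
import random

random.seed(0)

def make_data(n=60):
    """Toy dataset: features 0-1 are informative, features 2-5 are noise."""
    data = []
    for _ in range(n):
        label = random.randint(0, 1)
        informative = [label + random.gauss(0, 0.3) for _ in range(2)]
        noise = [random.gauss(0, 1.0) for _ in range(4)]
        data.append((informative + noise, label))
    return data

def knn_loo_accuracy(data, mask):
    """Leave-one-out 1-NN accuracy using only features where mask[i] == 1."""
    if not any(mask):
        return 0.0
    idx = [i for i, m in enumerate(mask) if m]
    correct = 0
    for j, (x, y) in enumerate(data):
        nearest_label, best_d = None, float("inf")
        for k, (x2, y2) in enumerate(data):
            if k == j:
                continue
            d = sum((x[i] - x2[i]) ** 2 for i in idx)
            if d < best_d:
                best_d, nearest_label = d, y2
        correct += nearest_label == y
    return correct / len(data)

def genetic_feature_selection(data, n_features, pop_size=20, generations=15):
    """Evolve bitmasks over features; fitness is wrapper accuracy."""
    pop = [[random.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: knn_loo_accuracy(data, m),
                        reverse=True)
        survivors = ranked[: pop_size // 2]           # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n_features)     # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(n_features)          # single-bit mutation
            child[i] = 1 - child[i]
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda m: knn_loo_accuracy(data, m))

data = make_data()
best = genetic_feature_selection(data, n_features=6)
print(best)
```

In an ensemble setting, each member would be trained on the feature subset encoded by a different high-fitness chromosome, which is how subset selection can shrink the ensemble's memory footprint.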
