A Bacterial Evolutionary Algorithm for automatic data clustering

This paper describes an evolutionary clustering algorithm that automatically partitions a given dataset into the optimal number of groups in a single optimization run. The proposed method is based on an evolutionary computing technique known as the Bacterial Evolutionary Algorithm (BEA), which draws inspiration from the biological phenomenon of microbial evolution. Unlike the conventional mutation, crossover and selection operations of a Genetic Algorithm (GA), the BEA evolves its population through two special operations: bacterial mutation and gene transfer. In the present context, these operations have been modified to handle the variable lengths of the chromosomes that encode different cluster groupings. Experiments were carried out on several synthetic and real-life datasets, including a remote sensing satellite image. The results establish the superiority of the proposed approach in terms of final accuracy.
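To make the two operators concrete, the following is a minimal Python sketch of how bacterial mutation and gene transfer might act on variable-length chromosomes. It is an illustration under stated assumptions, not the algorithm of the paper: the encoding (a chromosome as a list of 2-D cluster centres), the fitness function `cost` with its per-cluster `penalty`, the add/remove/perturb mutation moves, and all parameter values (`n_clones`, `n_transfers`, `k_min`, `k_max`) are hypothetical choices made only for this example.

```python
import random

def cost(centers, points, penalty=1.0):
    # Hypothetical fitness (an assumption, not taken from the paper):
    # squared distance of each point to its nearest centre,
    # plus a penalty per cluster so that fewer clusters are preferred.
    sse = sum(min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
              for px, py in points)
    return sse + penalty * len(centers)

def bacterial_mutation(centers, points, rng, n_clones=4, k_min=2, k_max=6):
    # Create clones of one chromosome, mutate each clone, and keep the
    # best of the original and its clones. Adding or removing a centre
    # is what lets the chromosome length (number of clusters) vary.
    best = list(centers)
    for _ in range(n_clones):
        clone = list(centers)
        op = rng.choice(["perturb", "add", "remove"])
        if op == "add" and len(clone) < k_max:
            clone.append(rng.choice(points))
        elif op == "remove" and len(clone) > k_min:
            clone.pop(rng.randrange(len(clone)))
        else:
            i = rng.randrange(len(clone))
            cx, cy = clone[i]
            clone[i] = (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        if cost(clone, points) < cost(best, points):
            best = clone
    return best

def gene_transfer(population, points, rng, n_transfers=4, k_max=6):
    # Rank the population, then copy one gene (a centre) from a
    # chromosome in the better half into one in the worse half.
    for _ in range(n_transfers):
        population.sort(key=lambda c: cost(c, points))
        half = len(population) // 2
        source = rng.choice(population[:half])
        target = rng.choice(population[half:])
        gene = rng.choice(source)
        if len(target) < k_max:
            target.append(gene)                         # chromosome grows
        else:
            target[rng.randrange(len(target))] = gene   # overwrite a gene
    return population

def cluster(points, pop_size=8, generations=30, seed=0, k_min=2, k_max=6):
    # Requires len(points) >= k_max and pop_size >= 2.
    rng = random.Random(seed)
    population = [rng.sample(points, rng.randint(k_min, k_max))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = [bacterial_mutation(c, points, rng, k_min=k_min, k_max=k_max)
                      for c in population]
        population = gene_transfer(population, points, rng, k_max=k_max)
    return min(population, key=lambda c: cost(c, points))
```

Calling `cluster(points)` on a list of `(x, y)` tuples returns the best list of centres found; its length is the number of clusters the search settled on. In this sketch the cluster count is driven entirely by the assumed penalty term, so a different validity measure would change the outcome.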
