Automatic Clustering Using an Improved Differential Evolution Algorithm

Differential evolution (DE) has emerged as a fast, robust, and efficient global search heuristic of current interest. This paper describes an application of DE to the automatic clustering of large unlabeled data sets. In contrast to most existing clustering techniques, the proposed algorithm requires no prior knowledge of the data to be classified. Rather, it determines the optimal number of partitions of the data "on the run." The superiority of the new method is demonstrated by comparing it with two recently developed partitional clustering techniques and one popular hierarchical clustering algorithm. The two partitional clustering algorithms are based on two powerful, well-known optimization algorithms, namely the genetic algorithm and particle swarm optimization. An interesting real-world application of the proposed method to the automatic segmentation of images is also reported.
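To make the idea of choosing the number of partitions during the search more concrete, the following is a minimal sketch rather than the paper's improved algorithm. It uses the classic DE/rand/1/bin scheme and assumes, purely for illustration, that each candidate vector holds K_MAX activation values followed by K_MAX centroids, with a centroid counted as "in use" when its activation exceeds 0.5. The names K_MAX, DIM, decode, fitness, and de_cluster, the 0.5 threshold, and the simple compactness-over-separation objective are all hypothetical choices and are not taken from the abstract.

```python
import math
import random

K_MAX, DIM = 4, 2  # hypothetical upper bound on clusters; data dimension


def decode(vec):
    """Return the centroids whose activation exceeds 0.5 (always at least two)."""
    acts = vec[:K_MAX]
    cents = [vec[K_MAX + k * DIM: K_MAX + (k + 1) * DIM] for k in range(K_MAX)]
    active = [k for k in range(K_MAX) if acts[k] > 0.5]
    if len(active) < 2:  # fall back to the two most strongly activated centroids
        active = sorted(range(K_MAX), key=lambda k: acts[k], reverse=True)[:2]
    return [cents[k] for k in active]


def fitness(vec, data):
    """Mean point-to-centroid distance over minimum centroid separation (lower is better)."""
    cents = decode(vec)
    counts = [0] * len(cents)
    total = 0.0
    for p in data:
        dists = [math.dist(p, c) for c in cents]
        j = dists.index(min(dists))
        counts[j] += 1
        total += dists[j]
    if min(counts) == 0:  # penalise partitions that contain an empty cluster
        return float("inf")
    sep = min(math.dist(a, b) for i, a in enumerate(cents) for b in cents[i + 1:])
    return (total / len(data)) / (sep + 1e-12)


def de_cluster(data, pop_size=20, gens=100, F=0.8, CR=0.9, seed=0):
    """Classic DE/rand/1/bin over the activation-plus-centroid encoding (illustrative)."""
    rng = random.Random(seed)
    lo = [min(p[d] for p in data) for d in range(DIM)]
    hi = [max(p[d] for p in data) for d in range(DIM)]

    def rand_vec():
        acts = [rng.random() for _ in range(K_MAX)]
        cents = [rng.uniform(lo[d], hi[d]) for _ in range(K_MAX) for d in range(DIM)]
        return acts + cents

    pop = [rand_vec() for _ in range(pop_size)]
    fit = [fitness(v, data) for v in pop]
    n = len(pop[0])
    for _ in range(gens):
        for i in range(pop_size):
            # Three distinct population members, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated component
            trial = [
                pop[a][j] + F * (pop[b][j] - pop[c][j])
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(n)
            ]
            f = fitness(trial, data)
            if f <= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = trial, f
    best = min(range(pop_size), key=lambda i: fit[i])
    return decode(pop[best]), fit[best]


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]
    centroids, score = de_cluster(data)
    print(len(centroids), "clusters found")
```

Because the activation values evolve together with the centroid coordinates, the number of active clusters can change from one generation to the next without being fixed in advance. The objective here is only a stand-in; any cluster validity index could be substituted for it.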
