Kernel based automatic clustering using modified particle swarm optimization algorithm

This paper introduces a method for clustering complex, linearly non-separable datasets without any prior knowledge of the number of naturally occurring clusters. The proposed method is based on an improved variant of the Particle Swarm Optimization (PSO) algorithm and employs a kernel-induced similarity measure in place of the conventional sum-of-squares distance. The kernel function makes it possible to cluster data that are linearly non-separable in the original input space into homogeneous groups in a transformed high-dimensional feature space. Computer simulations were carried out on a test bench of five synthetic and three real-life datasets to compare the performance of the proposed method with that of a few state-of-the-art clustering algorithms. In these experiments, the proposed algorithm outperforms the compared methods in accuracy, convergence speed, and robustness.
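To illustrate what a kernel-induced distance looks like, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian (RBF) kernel, which the abstract does not specify, and uses the standard identity $\|\varphi(x)-\varphi(y)\|^2 = K(x,x) - 2K(x,y) + K(y,y)$ to measure distance in the feature space without computing the mapping $\varphi$ explicitly. The function names and the `sigma` parameter are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel choice and sigma are assumptions."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_distance_sq(x, y, kernel=gaussian_kernel):
    """Squared distance between x and y in the kernel-induced feature space.

    Uses ||phi(x) - phi(y)||^2 = K(x, x) - 2 K(x, y) + K(y, y),
    so the feature map phi never has to be evaluated.
    """
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


if __name__ == "__main__":
    point = [0.0, 0.0]
    centroid = [1.0, 0.0]
    print(kernel_distance_sq(point, centroid))
```

For the Gaussian kernel, $K(x,x) = 1$, so the expression reduces to $2\,(1 - K(x,y))$: the distance is 0 for identical points and approaches 2 as the points move far apart in the input space. A clustering objective can substitute this quantity for the squared Euclidean distance between a data point and a cluster centre.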
