Swarm Intelligence Algorithms for Data Clustering

Clustering aims to represent a large dataset by a smaller number of prototypes or clusters. It simplifies the modeling of data and thus plays a central role in knowledge discovery and data mining. Data mining tasks today require fast and accurate partitioning of huge datasets, which may come with a variety of attributes or features. This, in turn, imposes severe computational requirements on the relevant clustering techniques. A family of bio-inspired algorithms known as Swarm Intelligence (SI) has recently emerged that meets these requirements and has been applied successfully to a number of real-world clustering problems. This chapter explores the role of SI in clustering different kinds of datasets. It then describes a new SI technique for partitioning any dataset into an optimal number of groups in a single run of optimization. Results of computer simulations undertaken in this research are also provided to demonstrate the effectiveness of the proposed algorithm.
