Metaheuristic Pattern Clustering – An Overview

This chapter provides a comprehensive overview of data clustering techniques based on nature-inspired metaheuristic algorithms. It first presents, in a formal way, the clustering problem, the similarity and dissimilarity measures between patterns, and the methods of cluster validation. A few classical clustering algorithms are also addressed. The chapter then discusses the relevance of population-based approaches to pattern clustering, with a focus on evolutionary computing, and outlines the most promising evolutionary clustering methods. It ends with a discussion of the automatic clustering problem, which remains largely unsolved by most traditional clustering algorithms.
