An Empirical Comparison Between K-Means, GAs and EDAs in Partitional Clustering

In this chapter we empirically compare three approaches to partitional clustering: iterative hill-climbing algorithms, genetic algorithms, and estimation of distribution algorithms. We emphasize their ability to avoid local maxima and how easy it is to choose good parameter settings for each.
