A New Reduced-Length Genetic Representation for Evolutionary Multiobjective Clustering

The last decade has seen a growing body of research illustrating the advantages of the evolutionary multiobjective approach to data clustering. The scalability of this approach, however, merits more attention given the unprecedented volumes of data generated nowadays. This paper proposes a reduced-length representation for evolutionary multiobjective clustering. The new encoding explicitly prunes the solution space and allows the search method to focus on its most promising regions. Moreover, it allows information to be precomputed, which reduces the computational overhead of processing candidate individuals during optimisation. We investigate the suitability of this proposal in the context of a representative algorithm from the literature: MOCK. Our results indicate that the new reduced-length representation significantly improves the effectiveness and computational efficiency of MOCK specifically, and can be seen as a further step towards better scalability of evolutionary multiobjective clustering in general.
