Messy Genetic Algorithms: Motivation, Analysis, and First Results

This paper defines and explores a somewhat different type of genetic algorithm (GA)-a messy genetic algorithm (mGA). Messy GAs process variable-length strings that may be either under- or overspecified with respect to the problem being solved. As nature has formed its genotypes by progressing from simple to more complex life forms, messy GAs solve problems by combining relatively short, well-tested building blocks to form longer, more complex strings that increasingly cover all features of a problem. This approach stands in contrast to the usual fixed-length, fixed-coding genetic algorithm, where the existence of the requisite tight linkage is taken for granted or ignored altogether. To compare the two approaches, a 30-bit, order-three-deceptive problem is searched using a simple GA and a messy GA. Using a random but fixed ordering of the bits, the simple GA makes errors at roughly three-quarters of its positions; under a worst-case ordering, the simple GA errs at all positions. In contrast to the simple GA results, the messy GA repeatedly solves the same problem to optimality. Prior to this time, no GA had ever solved a provably difficult problem to optimality without prior knowledge of good string arrangements. The mGA presented herein repeatedly achieves globally optimal results without such knowledge, and it does so at the very first generation in which strings are long enough to cover the problem. The solution of a difficult nonlinear problem to optimality suggests that messy GAs can solve more difficult problems than has been possible to date with other genetic algorithms. The ramifications of these techniques in search and machine learning are explored, including the possibility of messy floating-point codes, messy permutations, and messy classifiers.