Measuring and Optimizing Behavioral Complexity for Evolutionary Reinforcement Learning

Model complexity is a key concern for any artificial learning system because of its critical impact on generalization. However, evolutionary computation (EC) research has focused only on the structural complexity of phenotypes in static problems. In sequential decision tasks, phenotypes that are very similar in structure can produce radically different behaviors, and the trade-off between fitness and complexity in this context is not clear. In this paper, behavioral complexity is measured explicitly using compression and treated as a separate objective to be optimized (not as an additional regularization term in a scalar fitness), in order to study this trade-off directly.
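The idea of scoring behavior by compression and keeping it as a separate objective can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a behavior is recorded as a sequence of discrete actions in the range 0..255, it uses Python's zlib (DEFLATE) as a stand-in compressor, and the names behavioral_complexity, dominates, and pareto_front are hypothetical.

```python
import zlib
from typing import List, Sequence, Tuple

# Assumption: an individual is summarized as (fitness, complexity),
# with fitness maximized and complexity minimized.
Objectives = Tuple[float, float]


def behavioral_complexity(actions: Sequence[int]) -> int:
    """Return the compressed size in bytes of an action trace.

    Assumption: each action is an integer in 0..255, so the trace can be
    packed into a byte string. zlib is only a stand-in compressor here.
    """
    return len(zlib.compress(bytes(actions), 9))


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Under these assumptions, a highly repetitive action trace compresses to fewer bytes than an irregular one of the same length, so it receives a lower complexity score. Keeping fitness and complexity as a pair, rather than folding them into one weighted sum, lets a multi-objective selection scheme expose the whole trade-off curve instead of a single point on it.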
