Do not match, inherit: fitness surrogates for genetics-based machine learning techniques

A beneficial byproduct of using probabilistic model-building genetic algorithms is the creation of cheap and accurate surrogate models. Learning classifier systems, and genetics-based machine learning in general, can benefit greatly from such surrogates, which may replace the costly procedure of matching a rule against large data sets. In this paper we investigate the accuracy of such surrogate fitness functions when coupled with the probabilistic models evolved by the $\chi$-ary extended compact classifier system ($\chi$eCCS). To that end, we show that the probabilistic models must be able to represent all the basis functions required for creating an accurate surrogate. We also introduce a procedure to transform populations of rules into dependency structure matrices (DSMs), which allows building accurate models of overlapping building blocks, a necessary condition for accurately estimating the fitness of the evolved rules.
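
To make concrete the requirement that the model represent the needed basis functions, here is a minimal sketch of a substructural surrogate. It is not the procedure used in the paper: it assumes a simple schema-average (fitness inheritance) estimate, binary strings instead of rules, and a toy fitness function. The names `fit_surrogate`, `estimate`, `true_fitness`, `block_model`, and `bitwise_model` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from itertools import product


def fit_surrogate(population, fitnesses, groups):
    """Return the mean fitness and one coefficient per (group, schema) seen.

    Each coefficient is the average fitness of the individuals that contain
    the schema minus the average fitness of the whole sample.
    """
    mean_f = sum(fitnesses) / len(fitnesses)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ind, f in zip(population, fitnesses):
        for g in groups:
            key = (g, tuple(ind[i] for i in g))
            sums[key] += f
            counts[key] += 1
    coeffs = {k: sums[k] / counts[k] - mean_f for k in sums}
    return mean_f, coeffs


def estimate(ind, groups, mean_f, coeffs):
    """Estimate fitness without evaluating (or matching) the individual."""
    return mean_f + sum(
        coeffs.get((g, tuple(ind[i] for i in g)), 0.0) for g in groups
    )


def true_fitness(ind):
    """Toy stand-in for the costly evaluation: reward equal bits per block."""
    return sum(1.0 if ind[a] == ind[b] else 0.0 for a, b in [(0, 1), (2, 3)])


# Assumed sample: every 4-bit string, evaluated once with the costly function.
population = list(product([0, 1], repeat=4))
fitnesses = [true_fitness(ind) for ind in population]

# A model whose groups match the real interactions, and one that ignores them.
block_model = [(0, 1), (2, 3)]
bitwise_model = [(0,), (1,), (2,), (3,)]

block_fit = fit_surrogate(population, fitnesses, block_model)
bitwise_fit = fit_surrogate(population, fitnesses, bitwise_model)

for ind in [(0, 0, 1, 1), (0, 1, 1, 1), (0, 1, 1, 0)]:
    print(
        ind,
        true_fitness(ind),
        estimate(ind, block_model, *block_fit),
        estimate(ind, bitwise_model, *bitwise_fit),
    )
```

In this toy setting the surrogate built on `block_model` reproduces the true fitness, while the one built on `bitwise_model` cannot tell individuals apart, because no single-bit schema carries the interaction between the two bits of a block.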
