Adapted Maximum-Likelihood Gaussian Models for Numerical Optimization with Continuous EDAs

This article focuses on numerical optimization with continuous Estimation-of-Distribution Algorithms (EDAs). Specifically, the focus is on the use of one of the most common and best understood probability distributions: the normal distribution. We first give an overview of the existing research on this topic. We then point out a source of inefficiency in EDAs that use maximum-likelihood (ML) estimates of the normal distribution. Scaling the covariance matrix beyond its ML estimate does not remove this inefficiency; instead, the orientation of the normal distribution must be changed. So far, only Evolution Strategies (ES), and in particular the Covariance Matrix Adaptation ES (CMA-ES), have been capable of achieving such re-orientation. In this article we provide a simple but effective technique for achieving re-orientation while still performing only the well-known ML estimates. We call the new technique Anticipated Mean Shift (AMS). The resulting EDA, called the Adapted Maximum-Likelihood Gaussian Model Iterated Density-Estimation Evolutionary Algorithm (AMaLGaM-IDEA), adapts not only the ML estimate of the covariance matrix but also the ML estimate of the mean. AMaLGaM-IDEA improves on previous EDAs that use ML estimates as well as on previous EDAs that scale the variance adaptively. We also indicate the circumstances under which AMaLGaM-IDEA is robust to rotations of the search space. A comparison with CMA-ES identifies the conditions under which each algorithm outperforms the other. We conclude that AMaLGaM-IDEA is currently among the most efficient real-valued continuous EDAs while remaining relatively simple to understand (especially in the naive, univariate case). Pseudo-code is provided in this article; source code can be downloaded from the web.
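As a rough illustration of the AMS idea (the article's pseudo-code remains the authoritative description), the sketch below shows one way the sampling step could look in NumPy. The function name `sample_with_ams` and the defaults `shifted_fraction=0.3` and `delta_ams=2.0` are illustrative assumptions and need not match the article's exact settings.

```python
import numpy as np

def sample_with_ams(mu, mu_prev, cov, n_samples,
                    shifted_fraction=0.3, delta_ams=2.0, rng=None):
    """Draw a Gaussian population and apply the Anticipated Mean Shift:
    move a fraction of the new samples further along the direction in
    which the ML mean estimate moved between generations."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    mu_prev = np.asarray(mu_prev, dtype=float)

    # Standard ML-based sampling step: draw from N(mu, cov).
    samples = rng.multivariate_normal(mu, cov, size=n_samples)

    # Anticipated mean shift: the direction in which the mean just moved.
    shift = delta_ams * (mu - mu_prev)

    # Shift only part of the population. Selection then mixes shifted and
    # unshifted solutions, so the next ML covariance estimate is stretched
    # along the direction of improvement, re-orienting the distribution.
    n_shifted = int(shifted_fraction * n_samples)
    samples[:n_shifted] += shift
    return samples

# Illustrative use: mu_prev and mu would be the ML mean estimates of the
# selected solutions from the two preceding generations.
pop = sample_with_ams(mu=[0.5, 0.5], mu_prev=[0.0, 0.0],
                      cov=np.eye(2), n_samples=10)
```

The design point is that AMS requires no machinery beyond the ML estimates themselves: the mean shift between generations is information an ML-based EDA already computes, and re-using it to displace part of the new population is what lets selection re-orient the covariance on slope-like regions of the landscape.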
