Global linear convergence of Evolution Strategies with recombination on scaling-invariant functions

Evolution Strategies (ES) are stochastic derivative-free optimization algorithms whose most prominent representative, the CMA-ES algorithm, is widely used to solve difficult numerical optimization problems. We provide the first rigorous investigation of the linear convergence of step-size adaptive ES involving a population and recombination, two ingredients that are crucial in practice for robustness to local irregularities and multimodality. Our methodology relies on investigating the stability of a Markov chain associated with the algorithm. The stability study builds on recent developments connecting the stability of deterministic control models to the stability of associated Markov chains. We investigate convergence on composites of strictly increasing functions with continuously differentiable scaling-invariant functions with a global optimum. This function class includes functions with non-convex sublevel sets and discontinuous functions. We prove the existence of a constant r such that the logarithm of the distance to the optimum divided by the number of iterations of step-size adaptive ES with weighted recombination converges to r. The constant is given as an expectation with respect to the stationary distribution of a Markov chain; its sign, which is found numerically, indicates whether the ES converges or diverges linearly. Our main condition for convergence is the increase of the expected log step-size on linear functions. In contrast to previous results, our condition is equivalent to the almost sure geometric divergence of the step-size.
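To make the statement about the constant r concrete, the following is a minimal sketch of how its sign might be estimated numerically by simulation. It is not the authors' implementation. The step-size rule (a path-length comparison without cumulation), the log-decreasing weights, the damping value, and the names `sphere` and `estimate_rate` are all assumptions chosen for illustration. The sketch also assumes the optimum is at the origin and uses scaling invariance to rescale the state after every iteration, so that the distance can be tracked on a log scale without numerical underflow.

```python
import math
import random


def sphere(x):
    """Scaling-invariant test function with its optimum at the origin."""
    return sum(xi * xi for xi in x)


def estimate_rate(f, dim=5, lam=10, iterations=2000, damping=2.0, seed=1):
    """Estimate r = lim (1/t) * log(distance to optimum) by simulation.

    Hypothetical sketch: weights, damping and step-size rule are
    illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    mu = lam // 2
    # Positive, decreasing recombination weights normalized to sum to 1.
    raw = [math.log(mu + 0.5) - math.log(i + 1) for i in range(mu)]
    total = sum(raw)
    w = [r / total for r in raw]
    mu_eff = 1.0 / sum(wi * wi for wi in w)
    # Standard approximation of the expected norm of a N(0, I) vector.
    chi = math.sqrt(dim) * (1 - 1 / (4 * dim) + 1 / (21 * dim * dim))

    x = [1.0] + [0.0] * (dim - 1)  # initial mean at distance 1
    sigma = 1.0
    log_dist = 0.0  # accumulated log of the distance to the optimum

    for _ in range(iterations):
        steps = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(lam)]
        cands = [[xj + sigma * uj for xj, uj in zip(x, u)] for u in steps]
        # Comparison-based selection: only the ranking of f-values is used.
        order = sorted(range(lam), key=lambda k: f(cands[k]))
        # Weighted recombination of the mu best steps.
        z = [sum(w[i] * steps[order[i]][j] for i in range(mu)) for j in range(dim)]
        x = [xj + sigma * zj for xj, zj in zip(x, z)]
        # Step-size update: compare the recombined step length to chi.
        norm = math.sqrt(mu_eff) * math.sqrt(sum(zj * zj for zj in z))
        sigma *= math.exp((norm / chi - 1.0) / damping)
        # Rescale so that the mean has norm 1 again; valid because both the
        # algorithm and f are invariant under scaling around the optimum.
        dist = math.sqrt(sum(xj * xj for xj in x))
        log_dist += math.log(dist)
        x = [xj / dist for xj in x]
        sigma /= dist

    return log_dist / iterations


if __name__ == "__main__":
    print("estimated rate on the sphere:", estimate_rate(sphere))
```

A negative estimate suggests linear convergence and a positive one linear divergence. A finite simulation of this kind only approximates the limit and does not replace the stability analysis of the underlying Markov chain.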
