Natural evolution strategies converge on sphere functions

This theoretical investigation gives the first proof of convergence for (radial) natural evolution strategies on d-dimensional sphere functions and establishes the corresponding conditions on the hyper-parameters as a function of d. In the limit of large population sizes we show asymptotic linear convergence, and in the limit of small learning rates we give a full analytic characterization of the algorithm dynamics, decomposed into transient and asymptotic phases. Finally, we show why omitting the natural gradient is catastrophic.

