An adaptive scheme for real function optimization acting as a selection operator

We propose an adaptive scheme for real function optimization whose dynamics are driven by selection. The method is parametric and relies explicitly on the Gaussian density, viewed as an infinite search population. We define two gradient flows acting on the density parameters, in the spirit of neural network learning rules, which maximize either the expectation of the function with respect to the density or the expectation of its logarithm. The first leads to reinforcement learning and the second to selection learning. Both can be understood as the combined effect of three operators acting on the density: translation, scaling, and rotation. We then approximate these continuous flows by discrete-time dynamical systems using three different methods: Monte Carlo integration, selection among a finite population, and reinforcement learning. This work synthesizes previously independent approaches and shows that evolution strategies and reinforcement learning are strongly related.
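As an illustration of the Monte Carlo discretization mentioned above, the following sketch performs gradient ascent on the expectation of an objective under an isotropic Gaussian search density, using the score-function (log-likelihood) estimator of the gradient. It covers only the translation and scaling operators (an isotropic covariance, so no rotation), and all names, step sizes, and the toy objective are assumptions introduced for the example, not the paper's exact formulation.

```python
import numpy as np

def sphere(x):
    """Toy objective (an assumption for this sketch): maximizing -||x||^2
    drives the search density toward the origin."""
    return -np.sum(x ** 2, axis=-1)

def mc_gradient_step(mu, sigma, f, rng, n_samples=200, lr=0.05):
    """One discrete-time Monte Carlo step approximating the translation and
    scaling parts of the flow: gradient ascent on E_{N(mu, sigma^2 I)}[f]
    via the score-function estimator over a finite sample population."""
    d = mu.size
    eps = rng.standard_normal((n_samples, d))
    x = mu + sigma * eps                          # finite population drawn from the density
    fx = f(x)
    adv = fx - fx.mean()                          # subtract a baseline for variance reduction
    # Score-function gradients of the Gaussian log-density w.r.t. mu and sigma
    grad_mu = (adv[:, None] * eps).mean(axis=0) / sigma
    grad_sigma = (adv * ((eps ** 2).sum(axis=1) - d)).mean() / sigma
    sigma = max(sigma + lr * grad_sigma, 1e-2)    # keep the scale positive
    return mu + lr * grad_mu, sigma

rng = np.random.default_rng(0)
mu, sigma = np.full(3, 2.0), 1.0
for _ in range(300):
    mu, sigma = mc_gradient_step(mu, sigma, sphere, rng)
```

Replacing the raw objective values with their logarithm (or with rank-based selection weights over the sampled population) would turn the same update into the selection-learning variant described in the abstract.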
