Fitness Expectation Maximization

We present Fitness Expectation Maximization (FEM), a novel method for 'black box' function optimization. FEM searches the fitness landscape of an objective function using an instantiation of the well-known Expectation Maximization (EM) algorithm: new search points are generated to match the distribution of earlier samples, weighted toward higher expected fitness. FEM updates both the candidate solution parameters and the search policy, which is represented as a multinormal distribution. Inheriting EM's stability and strong guarantees, the method is both elegant and competitive with some of the best heuristic search methods in the field, and it performs well on a number of unimodal and multimodal benchmark tasks. To illustrate potential practical applications, we also report experiments on finding the parameters of a controller for the challenging non-Markovian double pole balancing task.
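The core idea described above, refitting a Gaussian search distribution to its own samples with more weight on fitter ones, can be sketched in a few lines. This is a rough illustration and not the FEM algorithm itself: the diagonal covariance (the abstract specifies a full multinormal), the rank-based weights, the toy objective `sphere`, and the function name `weighted_gaussian_search` are all assumptions made for this example.

```python
import math
import random


def sphere(x):
    """Toy objective to maximize: negated squared distance from the origin."""
    return -sum(v * v for v in x)


def weighted_gaussian_search(fitness, mean, sigma, n_samples=50, n_iters=30, seed=0):
    """Fitness-weighted refit of a diagonal Gaussian search distribution.

    Illustrative assumptions (not taken from the paper): diagonal covariance,
    rank-based weights on the better half of the samples.
    """
    rng = random.Random(seed)
    dim = len(mean)
    mean, sigma = list(mean), list(sigma)
    keep = n_samples // 2
    # Linearly decaying rank weights, normalized to sum to 1.
    raw = [keep - i for i in range(keep)]
    weights = [r / sum(raw) for r in raw]

    for _ in range(n_iters):
        # Draw candidate solutions from the current search distribution.
        samples = [[rng.gauss(mean[d], sigma[d]) for d in range(dim)]
                   for _ in range(n_samples)]
        # Order samples from fittest to least fit and keep the better half.
        elite = sorted(samples, key=fitness, reverse=True)[:keep]
        # Weighted maximum-likelihood refit of the mean and per-dimension variance.
        new_mean = [sum(w * x[d] for w, x in zip(weights, elite))
                    for d in range(dim)]
        new_var = [sum(w * (x[d] - new_mean[d]) ** 2 for w, x in zip(weights, elite))
                   for d in range(dim)]
        mean = new_mean
        # Floor the standard deviation so sampling never fully degenerates.
        sigma = [max(math.sqrt(v), 1e-12) for v in new_var]
    return mean, sigma


start = [1.0, -1.0]
best_mean, best_sigma = weighted_gaussian_search(sphere, start, [2.0, 2.0])
```

On this toy problem the returned mean should lie closer to the optimum at the origin than the starting point, and the search distribution should have narrowed. A full-covariance version would replace the per-dimension variance with a weighted covariance matrix.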
