Using Maximum Entropy and Generalized Belief Propagation in Estimation of Distribution Algorithms

Estimation of distribution algorithms (EDAs) work by sampling a population from a factorized distribution, such as the Boltzmann distribution of an additively decomposable fitness function (ADF). In the Factorized Distribution Algorithm (FDA), a factorization is built from an ADF by choosing a subset of its factors. I present a new algorithm that merges factors into larger sets, which makes it possible to account for all dependencies between the variables. Because estimates of distributions over larger subsets are more prone to sampling noise, the larger distribution can instead be estimated from the smaller ones with the Maximum Entropy method. Building an exact graphical model for sampling is often infeasible: in a 2-D grid, for example, the cliques of the triangulated Markov network grow linearly with the side length of the grid, so the clique distributions become exponentially large. I explore ways to use loopy models and Generalized Belief Propagation in the context of EDAs and optimization. The merging algorithm described above can be combined with these methods.
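The Maximum Entropy step can be illustrated with a minimal sketch. It assumes binary variables and uses iterative proportional fitting (IPF) as the fitting procedure; the abstract does not state which procedure is used, so IPF is an assumption here. The function name `ipf_max_entropy` and the marginal tables `p01` and `p12` are hypothetical and serve only as an illustration. Starting from the uniform distribution, the joint over a merged variable set is repeatedly rescaled until it agrees with each of the smaller marginals.

```python
from itertools import product


def ipf_max_entropy(n_vars, marginals, sweeps=50):
    """Fit a joint over n_vars binary variables to the given marginals.

    marginals maps a tuple of variable indices to a dict
    {assignment tuple: probability}. Starting from the uniform
    distribution, IPF approaches the maximum entropy joint that is
    consistent with the marginals (assuming they are consistent).
    """
    states = list(product((0, 1), repeat=n_vars))
    joint = {s: 1.0 / len(states) for s in states}
    for _ in range(sweeps):
        for subset, target in marginals.items():
            # Marginal of the current joint on this subset.
            current = {}
            for s, p in joint.items():
                key = tuple(s[i] for i in subset)
                current[key] = current.get(key, 0.0) + p
            # Rescale every state so the joint matches the target marginal.
            for s in states:
                key = tuple(s[i] for i in subset)
                if current[key] > 0.0:
                    joint[s] *= target[key] / current[key]
    return joint


# Hypothetical marginals on the overlapping subsets {x0, x1} and {x1, x2}.
# Both imply P(x1 = 0) = 0.5, so they are consistent with each other.
p01 = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p12 = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3}
joint = ipf_max_entropy(3, {(0, 1): p01, (1, 2): p12})
```

For this chain-shaped example the fitted joint equals p(x0, x1) p(x1, x2) / p(x1), which is the maximum entropy distribution with these two marginals. The table has 2^n entries, so the sketch is only practical for small merged sets.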
