Population Markov Chain Monte Carlo

Stochastic search algorithms inspired by physical and biological systems are applied to the problem of learning directed graphical probability models in the presence of missing observations and hidden variables. For this class of problems, deterministic search algorithms tend to halt at local optima, so random restarts are required to obtain solutions of acceptable quality. We compare three stochastic search algorithms: a Metropolis-Hastings Sampler (MHS), an Evolutionary Algorithm (EA), and a new hybrid algorithm called Population Markov Chain Monte Carlo, or popMCMC. PopMCMC uses statistical information from a population of MHSs to inform the proposal distributions for individual samplers in the population. Experimental results show that popMCMC and EAs learn more efficiently than the MHS with no information exchange. Populations of MCMC samplers also exhibit more diversity than populations evolved by EAs that do not satisfy physics-inspired local reversibility conditions.
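The idea of letting population statistics inform each sampler's proposal can be illustrated with a minimal sketch. It is not the authors' implementation: the binary-vector state, the toy `log_target` score, the smoothed per-bit frequency proposal, and all function names are assumptions chosen for illustration. Each chain is updated in turn with an independence proposal fitted to the other chains, and the Metropolis-Hastings acceptance ratio includes the proposal correction, so each update remains reversible with respect to the target.

```python
import math
import random

N_BITS = 8
# Hypothetical "best" configuration; a toy stand-in for a model structure score.
PATTERN = [1, 0, 1, 0, 1, 0, 1, 0]

def log_target(x):
    """Toy unnormalized log-probability: 2.0 for each bit matching PATTERN."""
    return sum(2.0 for bit, want in zip(x, PATTERN) if bit == want)

def bit_frequencies(others, n_bits, smoothing=1.0):
    """Smoothed per-bit frequency of 1s among the other chains' current states."""
    denom = len(others) + 2.0 * smoothing
    return [(sum(x[j] for x in others) + smoothing) / denom for j in range(n_bits)]

def log_q(x, probs):
    """Log-probability of x under independent Bernoulli(probs) bits."""
    return sum(math.log(p if bit else 1.0 - p) for bit, p in zip(x, probs))

def pop_mcmc_sweep(pop, rng):
    """Update every chain once; return the number of accepted proposals."""
    accepted = 0
    for i in range(len(pop)):
        # The proposal depends only on the OTHER chains, which stay fixed
        # during this update, so the Hastings correction below is valid.
        others = pop[:i] + pop[i + 1:]
        probs = bit_frequencies(others, len(pop[i]))
        proposal = [1 if rng.random() < p else 0 for p in probs]
        log_alpha = (log_target(proposal) - log_target(pop[i])
                     + log_q(pop[i], probs) - log_q(proposal, probs))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            pop[i] = proposal
            accepted += 1
    return accepted

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(10)]
total_accepted = sum(pop_mcmc_sweep(population, rng) for _ in range(200))
```

The smoothing term keeps every proposal probability strictly between 0 and 1, so no configuration becomes unreachable even if the population temporarily agrees on a bit. A practical sampler would likely mix this population-informed move with local moves such as single-bit flips.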
