Interdependent Gibbs Samplers

Gibbs sampling, as a model learning method, produces some of the most accurate results available in a variety of domains and is a de facto standard in those domains. Yet it is also well known that Gibbs random walks often get trapped at bottlenecks, sometimes termed "local maxima", so samplers frequently return suboptimal solutions. In this paper we introduce a variation of the Gibbs sampler that yields high-likelihood solutions significantly more often than the regular Gibbs sampler. Specifically, we show that combining multiple samplers, with a certain dependence (coupling) between them, results in higher-likelihood solutions. This side-steps the well-known issue of identifiability, which has been the obstacle to combining samplers in previous work. We evaluate the approach on a Latent Dirichlet Allocation model and on HMMs, where exact computation of likelihoods and comparisons to the standard EM algorithm are possible.
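To make the baseline concrete, the following is a minimal sketch of a plain Gibbs sampler with the naive multi-start strategy the paper improves on: several independent chains are run and the highest-likelihood state found is kept. The toy model (a two-component Gaussian mixture with known means, Gibbs sweeps over the assignments) and all names here are illustrative; the paper's coupling mechanism between samplers is not shown.

```python
import math
import random

# Toy model: each data point x_i is assigned a label z_i in {0, 1},
# corresponding to a Gaussian with mean MU[z_i] and unit variance.
MU = (-2.0, 2.0)
DATA = [-2.3, -1.9, -2.1, 2.2, 1.8, 2.4]

def loglik(z):
    """Joint log-likelihood of the data under assignment z (up to a constant)."""
    return sum(-0.5 * (x - MU[k]) ** 2 for x, k in zip(DATA, z))

def gibbs_chain(steps, seed):
    """Run one Gibbs chain; return the best (log-likelihood, assignment) seen."""
    rng = random.Random(seed)
    z = [rng.randint(0, 1) for _ in DATA]
    best = (loglik(z), list(z))
    for _ in range(steps):
        for i, x in enumerate(DATA):
            # Full conditional: p(z_i = k | x_i) ∝ exp(-0.5 * (x_i - mu_k)^2)
            w = [math.exp(-0.5 * (x - MU[k]) ** 2) for k in (0, 1)]
            z[i] = 0 if rng.random() < w[0] / (w[0] + w[1]) else 1
        ll = loglik(z)
        if ll > best[0]:
            best = (ll, list(z))
    return best

# Naive multi-start: independent (uncoupled) chains, keep the best solution.
results = [gibbs_chain(200, seed) for seed in range(4)]
best_ll, best_z = max(results)
```

In this baseline the chains share nothing, so each can independently get stuck at a local maximum; the paper's contribution is to introduce dependence (coupling) between the chains so that high-likelihood solutions are reached more often.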
