Variational MCMC

We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as a proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the target distribution. We solve this problem by introducing more sophisticated MCMC algorithms. One of these algorithms is a mixture of two MCMC kernels: a random walk Metropolis kernel and a block Metropolis-Hastings (MH) kernel that uses the variational approximation as its proposal distribution. The block MH kernel locates regions of high probability efficiently, while the random walk Metropolis kernel explores the vicinity of these regions. This algorithm outperforms variational approximations because it yields slightly better estimates of the mean and considerably better estimates of higher moments, such as covariances. It also outperforms standard MCMC algorithms because it locates the regions of high probability quickly, thus speeding up convergence. We also present an adaptive MCMC algorithm that iterates between improving the variational approximation and improving the MCMC approximation. We demonstrate the algorithms on the problem of Bayesian parameter estimation for logistic (sigmoid) belief networks.
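To make the mixture-kernel idea concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes a Gaussian variational approximation with mean `q_mean` and covariance `q_cov`, a user-supplied log target density `log_target`, and illustrative parameters `mix_prob` and `rw_scale` (all of these names are assumptions). At each iteration the sampler either proposes from the variational approximation (an independence MH move) or takes a random walk Metropolis step; both moves are corrected by the standard MH acceptance rule, so the mixture kernel leaves the target distribution invariant.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of a multivariate Gaussian N(mean, cov) evaluated at x."""
    d = len(mean)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + diff @ np.linalg.solve(cov, diff))

def variational_mcmc(log_target, q_mean, q_cov, x0, n_iters=5000,
                     mix_prob=0.5, rw_scale=0.1, rng=None):
    """Mixture of an independence MH kernel (variational proposal) and a
    random walk Metropolis kernel. Illustrative sketch, not the paper's code."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    d = x.size
    log_p_x = log_target(x)
    samples = np.empty((n_iters, d))
    for t in range(n_iters):
        if rng.random() < mix_prob:
            # Block MH move: propose from the variational approximation q.
            y = rng.multivariate_normal(q_mean, q_cov)
            log_p_y = log_target(y)
            # Independence-sampler ratio: p(y) q(x) / (p(x) q(y)).
            log_alpha = (log_p_y + log_gauss(x, q_mean, q_cov)
                         - log_p_x - log_gauss(y, q_mean, q_cov))
        else:
            # Local random walk Metropolis move with a symmetric Gaussian proposal.
            y = x + rw_scale * rng.standard_normal(d)
            log_p_y = log_target(y)
            log_alpha = log_p_y - log_p_x
        if np.log(rng.random()) < log_alpha:
            x, log_p_x = y, log_p_y
        samples[t] = x
    return samples
```

Because each component kernel leaves the target distribution invariant, any fixed mixture of them does as well; this is what lets the global independence moves (which jump to high-probability regions found by the variational approximation) be combined freely with the local random walk moves (which explore around those regions).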