Underdamped Langevin MCMC: A non-asymptotic analysis

We study the underdamped Langevin diffusion when the log of the target distribution is smooth and strongly concave. We present an MCMC algorithm based on its discretization and show that it achieves $\varepsilon$ error (in 2-Wasserstein distance) in $\mathcal{O}(\sqrt{d}/\varepsilon)$ steps. This is a significant improvement over the best known rate for overdamped Langevin MCMC, which is $\mathcal{O}(d/\varepsilon^2)$ steps under the same smoothness and concavity assumptions. The underdamped Langevin MCMC scheme can be viewed as a version of Hamiltonian Monte Carlo (HMC), which has been observed to outperform overdamped Langevin MCMC methods in a number of application areas. We provide quantitative rates that support this empirical wisdom.
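
To make the idea of discretizing the underdamped Langevin diffusion concrete, the following is a minimal sketch under stated assumptions, not the scheme analyzed in the paper. It assumes one common parameterization of the diffusion, $dx_t = v_t\,dt$, $dv_t = -\gamma v_t\,dt - u\nabla f(x_t)\,dt + \sqrt{2\gamma u}\,dB_t$, where the target density is proportional to $e^{-f(x)}$, and it applies a plain Euler-Maruyama step rather than any more refined integrator. The quadratic potential, the values $\gamma = 2$ and $u = 1$, the step size, and all function names are illustrative choices that do not come from the abstract.

```python
import numpy as np


def grad_f(x, mu):
    """Gradient of the illustrative potential f(x) = 0.5 * ||x - mu||^2.

    The corresponding target is a unit-covariance Gaussian centered at mu,
    which is smooth and strongly log-concave.
    """
    return x - mu


def underdamped_langevin(grad, x0, n_steps, step, gamma=2.0, u=1.0, seed=0):
    """Euler-Maruyama discretization of the underdamped Langevin diffusion.

    Assumed dynamics (one common parameterization):
        dx = v dt
        dv = -gamma * v dt - u * grad(x) dt + sqrt(2 * gamma * u) dB
    x0 may hold many independent chains, one per row.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)  # start every chain at rest
    noise_scale = np.sqrt(2.0 * gamma * u * step)
    for _ in range(n_steps):
        g = grad(x)
        x_new = x + step * v  # position moves along the current velocity
        # velocity feels friction, the gradient force, and Gaussian noise
        v = v - step * (gamma * v + u * g) + noise_scale * rng.standard_normal(x.shape)
        x = x_new
    return x, v


# Hypothetical usage: 2000 parallel chains targeting N(mu, I) in d = 2.
mu = np.array([3.0, -1.0])
x0 = np.zeros((2000, 2))
samples, velocities = underdamped_langevin(
    lambda x: grad_f(x, mu), x0, n_steps=400, step=0.05
)
print("empirical mean:", samples.mean(axis=0))
print("empirical variance:", samples.var(axis=0))
```

With a small step size the empirical mean and variance of `samples` should lie close to those of the Gaussian target, up to a discretization bias that shrinks with the step. The overdamped counterpart drops the velocity variable and updates the position directly with the gradient and noise.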
