On dissipative symplectic integration with applications to gradient-based optimization

Recently, continuous dynamical systems have proved useful in providing conceptual and quantitative insights into gradient-based optimization, widely used in modern machine learning and statistics. An important question that arises in this line of work is how to discretize the system in such a way that its stability and rates of convergence are preserved. In this paper we propose a geometric framework in which such discretizations can be realized systematically, enabling the derivation of "rate-matching" optimization algorithms without the need for a discrete convergence analysis. More specifically, we show that a generalization of symplectic integrators to dissipative Hamiltonian systems is able to preserve continuous rates of convergence up to a controlled error. Moreover, such methods preserve a perturbed Hamiltonian despite the absence of a conservation law, extending key results of symplectic integrators to dissipative cases. Our arguments rely on a combination of backward error analysis with fundamental results from symplectic geometry.
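To make the idea of a dissipative generalization of a symplectic integrator concrete, the following is a minimal sketch, not necessarily the scheme analyzed in the paper. It assumes the damped Hamiltonian system dx/dt = p, dp/dt = -∇f(x) - γp with a constant damping coefficient γ, and splits each step into an exact damping of the momentum followed by a symplectic Euler kick and drift. The function name `conformal_symplectic_euler`, the parameter names, and the quadratic test objective are illustrative assumptions.

```python
import math


def conformal_symplectic_euler(grad_f, x0, step=0.1, damping=1.0, n_steps=200):
    """Sketch of a dissipative (conformal) symplectic Euler iteration.

    Assumed continuous system (illustrative, constant damping):
        dx/dt = p,   dp/dt = -grad f(x) - damping * p
    """
    x = list(x0)
    p = [0.0] * len(x)
    # Exact flow of the purely dissipative part dp/dt = -damping * p.
    decay = math.exp(-damping * step)
    for _ in range(n_steps):
        g = grad_f(x)
        # Momentum update: damp exactly, then apply the gradient "kick".
        p = [decay * pi - step * gi for pi, gi in zip(p, g)]
        # Position update: "drift" using the already-updated momentum.
        x = [xi + step * pi for xi, pi in zip(x, p)]
    return x


# Hypothetical test objective: f(x) = 0.5 * (x_1^2 + 10 * x_2^2), minimized at 0.
def grad_quadratic(x):
    return [x[0], 10.0 * x[1]]


x_final = conformal_symplectic_euler(grad_quadratic, [1.0, 1.0])
print(x_final)
```

In this sketch the Jacobian determinant of each step equals exp(-γh) per degree of freedom, where h is the step size, so phase-space volume contracts at the same rate as under the assumed continuous flow. This is the sense in which the iteration is "conformal symplectic" rather than symplectic.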
