A Dynamical Systems Perspective on Nesterov Acceleration

We present a dynamical systems framework for understanding Nesterov's accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation and its discretization to gain insight into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.
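To make the semi-implicit Euler claim concrete, the following minimal sketch compares the standard two-sequence form of Nesterov's method for strongly convex objectives with a velocity-first, position-second update of the same iteration. It is an illustration under stated assumptions rather than the derivation or the differential equation used in the paper: the velocity is defined as v_k = (x_k - x_{k-1}) / h, the time step h = sqrt(step) is an assumed choice, and the diagonal quadratic objective and all function names are hypothetical.

```python
import math

def grad(x, curvatures):
    # Gradient of the assumed test objective f(x) = 0.5 * sum(c_i * x_i**2).
    return [c * xi for c, xi in zip(curvatures, x)]

def nesterov(x0, curvatures, step, beta, iters):
    # Standard form: extrapolate to y_k, then take a gradient step from y_k.
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        g = grad(y, curvatures)
        x_prev, x = x, [yi - step * gi for yi, gi in zip(y, g)]
    return x

def semi_implicit_euler(x0, curvatures, step, beta, iters, h):
    # Same iteration in position/velocity variables, v_k = (x_k - x_{k-1}) / h.
    # The velocity is updated first; the position update then uses the NEW
    # velocity, which is the semi-implicit (symplectic) Euler pattern.
    x, v = list(x0), [0.0] * len(x0)
    for _ in range(iters):
        y = [xi + beta * h * vi for xi, vi in zip(x, v)]  # gradient evaluated at x + beta*h*v
        g = grad(y, curvatures)
        v = [beta * vi - (step / h) * gi for vi, gi in zip(v, g)]
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x

curvatures = [1.0, 100.0]            # assumed example: mu = 1, L = 100
L, mu = max(curvatures), min(curvatures)
kappa = L / mu
step = 1.0 / L
beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
h = math.sqrt(step)                  # assumed time step; any h > 0 gives the same iterates

x_a = nesterov([1.0, 1.0], curvatures, step, beta, 200)
x_b = semi_implicit_euler([1.0, 1.0], curvatures, step, beta, 200, h)
print(max(abs(a - b) for a, b in zip(x_a, x_b)))
```

The two routines agree up to floating-point rounding because substituting h * v_k = x_k - x_{k-1} into the second routine recovers the first. In this form the gradient is evaluated at the shifted point x + beta * h * v rather than at x, which is one way to see how a dependence on curvature can enter the damping.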
