Recent Advances in Stochastic Riemannian Optimization

Stochastic and finite-sum optimization problems are central to machine learning. Many instances of these problems involve nonlinear constraints under which the parameters of interest lie on a manifold. Consequently, stochastic manifold optimization algorithms have seen rapid growth in recent years, driven in part by their computational performance. This chapter outlines a range of stochastic optimization algorithms on manifolds, from the basic stochastic gradient method to more advanced variance-reduced stochastic methods. In particular, we present a unified summary of convergence results. Finally, we provide several basic applications of these methods to machine learning problems, including learning the parameters of Gaussian mixtures, principal component analysis, and Wasserstein barycenters.
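
To make the basic stochastic gradient method on a manifold concrete, the following is a minimal sketch applied to principal component analysis. It is an illustration under stated assumptions, not code from the chapter: the manifold is taken to be the unit sphere, the finite-sum objective is f(x) = -(1/n) Σ_i (z_i^T x)^2, the retraction is renormalization, and the step size decays as eta0 / sqrt(t + 1). The function names (`normalize`, `riemannian_sgd_sphere`, `make_data`) and the synthetic data are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a nonzero vector to unit Euclidean norm (a point on the sphere)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def riemannian_sgd_sphere(data, steps=3000, eta0=0.02, seed=0):
    """Stochastic gradient descent on the unit sphere for the leading
    principal direction, minimizing f(x) = -(1/n) * sum_i (z_i^T x)^2."""
    rng = random.Random(seed)
    dim = len(data[0])
    x = normalize([1.0] * dim)  # arbitrary starting point on the sphere
    for t in range(steps):
        z = rng.choice(data)  # sample one summand f_i uniformly at random
        zx = sum(zi * xi for zi, xi in zip(z, x))
        # Euclidean gradient of f_i(x) = -(z^T x)^2.
        egrad = [-2.0 * zx * zi for zi in z]
        # Riemannian gradient: project onto the tangent space at x.
        gx = sum(g * xi for g, xi in zip(egrad, x))
        rgrad = [g - gx * xi for g, xi in zip(egrad, x)]
        # Assumed decaying step size.
        eta = eta0 / math.sqrt(t + 1)
        # Retraction: step along the negative gradient, then renormalize.
        x = normalize([xi - eta * g for xi, g in zip(x, rgrad)])
    return x


def make_data(n=500, seed=1):
    """Synthetic zero-mean samples whose dominant variance lies along axis 0."""
    rng = random.Random(seed)
    scales = [3.0, 1.0, 0.5]
    return [[s * rng.gauss(0.0, 1.0) for s in scales] for _ in range(n)]
```

Calling `riemannian_sgd_sphere(make_data())` returns a unit vector that should align, up to sign, with the first coordinate axis, which is the dominant direction of the synthetic data. Variance-reduced methods keep the same projection and retraction steps but replace the single-sample gradient with a corrected estimate.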
