Mixed Linear Regression with Multiple Components

In this paper, we study the mixed linear regression (MLR) problem, where the goal is to recover multiple underlying linear models from their unlabeled linear measurements. We propose a non-convex objective function which we show is {\em locally strongly convex} in the neighborhood of the ground truth. We use a tensor method for initialization so that the initial models lie in the local strong convexity region. We then employ general convex optimization algorithms to minimize the objective function. To the best of our knowledge, our approach provides the first exact recovery guarantees for the MLR problem with $K \geq 2$ components. Moreover, for {\em constant} $K$, our method has near-optimal computational complexity $\tilde O (Nd)$ as well as near-optimal sample complexity $\tilde O (d)$, where $N$ is the number of measurements and $d$ is the dimension. We also show that our non-convex formulation can be extended to the {\em subspace clustering} problem. In particular, when initialized within a small constant distance of the true subspaces, our method converges to the global optimum (and recovers the true subspaces) in time {\em linear} in the number of points. Finally, our empirical results indicate that even with random initialization, our approach converges to the global optimum in linear time, providing a speed-up of up to two orders of magnitude.
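To make the problem setup concrete, the following is a minimal sketch of noiseless MLR data and of one possible non-convex objective. The abstract does not state the objective's form, so the product of squared residuals used here is an assumption made for illustration, and the names `make_mlr_data` and `product_objective` are hypothetical. The sketch only checks that such an objective vanishes at the ground truth regardless of the order of the components; it does not implement the tensor initialization or the optimization step.

```python
import random


def make_mlr_data(models, n, seed=0):
    """Draw n noiseless samples; each response uses one randomly chosen hidden model."""
    rng = random.Random(seed)
    d = len(models[0])
    xs, ys = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        w = rng.choice(models)  # the component label is never recorded
        xs.append(x)
        ys.append(sum(xi * wi for xi, wi in zip(x, w)))
    return xs, ys


def product_objective(candidates, xs, ys):
    """Assumed objective: sum over samples of the product, over components,
    of squared residuals. Any sample explained exactly by one candidate
    contributes zero."""
    total = 0.0
    for x, y in zip(xs, ys):
        term = 1.0
        for w in candidates:
            r = y - sum(xi * wi for xi, wi in zip(x, w))
            term *= r * r
        total += term
    return total


# Two hypothetical ground-truth models in d = 3 dimensions (K = 2).
true_models = [[1.0, -2.0, 0.5], [-1.5, 0.3, 2.0]]
xs, ys = make_mlr_data(true_models, n=200)

at_truth = product_objective(true_models, xs, ys)
at_swapped = product_objective(true_models[::-1], xs, ys)  # components reordered
perturbed = [[w + 0.1 for w in model] for model in true_models]
at_perturbed = product_objective(perturbed, xs, ys)
```

In this sketch the objective is zero at the true models, unchanged when the components are reordered, and strictly positive once both models are perturbed, which is the sense in which the ground truth is a global minimizer that can be recovered only up to permutation.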
