Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling

The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem and provides careful analysis to justify it. The TSCC algorithm is practically a combination of Govindu’s multi-way spectral clustering framework (CVPR 2005) and Ng et al.’s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The quality of the clustering depends on the within-cluster errors, the between-cluster interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).
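To make the second ingredient concrete, the following is a minimal sketch of the spectral clustering procedure of Ng et al. as it is commonly stated: normalize a symmetric affinity matrix, take its top k eigenvectors, normalize the rows, and run k-means. It is not the TSCC algorithm itself. The function name `njw_spectral_clustering`, the farthest-point k-means initialization, and the numerical guards are assumptions made for illustration. In TSCC the matrix W would come from multi-way affinities in the spirit of Govindu's framework; that construction is not reproduced here.

```python
import numpy as np

def njw_spectral_clustering(W, k, n_iter=50):
    """Illustrative sketch: cluster n points given an n x n symmetric affinity W."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                                  # degrees
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))   # guard against zero degree
    Z = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]  # D^{-1/2} W D^{-1/2}

    # eigh returns eigenvalues in ascending order, so the last k columns
    # are the eigenvectors of the k largest eigenvalues.
    _, vecs = np.linalg.eigh(Z)
    U = vecs[:, -k:]

    # Normalize each row to unit length.
    T = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Plain k-means on the rows of T with a deterministic farthest-point
    # initialization (an assumption of this sketch).
    centers = [T[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((T - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(T[np.argmax(dist)])
    centers = np.array(centers)

    for _ in range(n_iter):
        sq = ((T[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            T[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Usage on a toy pairwise Gaussian affinity (not a multi-way affinity).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(10, 2)),
               rng.normal(5.0, 0.3, size=(10, 2))])
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
W = np.exp(-sq_dists / 2.0)
np.fill_diagonal(W, 0.0)   # zero self-affinity
labels = njw_spectral_clustering(W, k=2)
```

The toy example uses pairwise distances only to keep the sketch short; the role of the tuning parameter and of the within-cluster and between-cluster terms in the paper's analysis is not modeled here.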

[1]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[2]  Pietro Perona,et al.  Beyond pairwise clustering , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[3]  Allen Y. Yang,et al.  Estimation of Subspace Arrangements with Applications in Modeling and Segmenting Mixed Data , 2008, SIAM Rev..

[4]  Guillermo Sapiro,et al.  Translated Poisson Mixture Model for Stratification Learning , 2008, International Journal of Computer Vision.

[5]  Nanda Kambhatla,et al.  Fast Non-Linear Dimension Reduction , 1993, NIPS.

[6]  Gilad Lerman,et al.  Least squares approximations for probability measures via multi-way curvatures , 2010 .

[7]  Mi-Suen Lee,et al.  A Computational Framework for Segmentation and Grouping , 2000 .

[8]  Erkki Oja,et al.  Independent component analysis: algorithms and applications , 2000, Neural Networks.

[9]  Kenichi Kanatani,et al.  Evaluation and Selection of Models for Motion Segmentation , 2001, ECCV.

[10]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[11]  Guangliang Chen,et al.  Spectral Curvature Clustering (SCC) , 2009, International Journal of Computer Vision.

[12]  P. Tseng Nearest q-Flat to m Points , 2000 .

[13]  Joos Vandewalle,et al.  A Multilinear Singular Value Decomposition , 2000, SIAM J. Matrix Anal. Appl..

[14]  Tamir Hazan,et al.  Multi-way Clustering Using Super-Symmetric Non-negative Tensor Factorization , 2006, ECCV.

[15]  S. Shankar Sastry,et al.  Generalized principal component analysis (GPCA) , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Venu Madhav Govindu,et al.  A tensor decomposition for geometric grouping and segmentation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[17]  Gilles Blanchard,et al.  On the Convergence of Eigenspaces in Kernel Principal Component Analysis , 2005, NIPS.

[18]  Guillermo Sapiro,et al.  Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds , 2006, NIPS.

[19]  Marc Pollefeys,et al.  A General Framework for Motion Segmentation: Independent, Articulated, Rigid, Non-rigid, Degenerate and Non-degenerate , 2006, ECCV.

[20]  Achi Brandt,et al.  Fast multiscale clustering and manifold identification , 2006, Pattern Recognit..

[21]  G. Lerman,et al.  High-Dimensional Menger-Type Curvatures - Part I: Geometric Multipoles and Multiscale Inequalities , 2008, 0805.1425.

[22]  Tamara G. Kolda,et al.  Categories and Subject Descriptors: G.4 [Mathematics of Computing]: Mathematical Software— , 2022 .

[23]  Allen Y. Yang,et al.  Robust Statistical Estimation and Segmentation of Multiple Subspaces , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[24]  Michael I. Jordan,et al.  Mixtures of Probabilistic Principal Component Analyzers , 2001 .

[25]  John Wright,et al.  Segmentation of multivariate mixed data via lossy coding and compression , 2007, Electronic Imaging.

[26]  Tamara G. Kolda,et al.  MATLAB tensor classes for fast algorithm prototyping. , 2004 .

[27]  Paul S. Bradley,et al.  k-Plane Clustering , 2000, J. Glob. Optim..

[28]  Gilad Lerman,et al.  High-Dimensional Menger-Type Curvatures—Part II: d-Separation and a Menagerie of Curvatures , 2008, 0809.0137.

[29]  Xiaoming Huo,et al.  Near-optimal detection of geometric objects by fast multiscale methods , 2005, IEEE Transactions on Information Theory.

[30]  Gilad Lerman,et al.  On d-dimensional d-semimetrics and simplex-type inequalities for high-dimensional sine functions , 2008, J. Approx. Theory.

[31]  P. Torr Geometric motion segmentation and model selection , 1998, Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences.

[32]  Robert H. Halstead,et al.  Matrix Computations , 2011, Encyclopedia of Parallel Computing.

[33]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[34]  Takeo Kanade,et al.  A Multibody Factorization Method for Independently Moving Objects , 1998, International Journal of Computer Vision.

[35]  John Wright,et al.  Segmentation of Multivariate Mixed Data via Lossy Data Coding and Compression , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Guillermo Sapiro,et al.  Discriminative learned dictionaries for local image analysis , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[37]  Kun Huang,et al.  A unifying theorem for spectral embedding and clustering , 2003, AISTATS.

[38]  Mikhail Belkin,et al.  Consistency of spectral clustering , 2008, 0804.0678.

[39]  Kenichi Kanatani,et al.  Motion segmentation by subspace separation and model selection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[40]  Robert Pless,et al.  Manifold clustering , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[41]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[42]  Colin McDiarmid,et al.  Surveys in Combinatorics, 1989: On the method of bounded differences , 1989 .

[44]  Serge J. Belongie,et al.  Higher order learning with graphs , 2006, ICML.

[45]  Christopher M. Bishop,et al.  Mixtures of Probabilistic Principal Component Analyzers , 1999, Neural Computation.

[46]  Fabian J. Theis,et al.  Grassmann clustering , 2006, 2006 14th European Signal Processing Conference.

[47]  David J. Kriegman,et al.  Clustering appearances of objects under varying illumination conditions , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..