Are Tensor Decomposition Solutions Unique? On the Global Convergence of HOSVD and ParaFac Algorithms

Matrix factorizations and tensor decompositions are now widely used in machine learning and data mining. They decompose input matrix and tensor data into matrix factors by optimizing a least-squares objective function using iterative updating algorithms, e.g., HOSVD (High Order Singular Value Decomposition) and ParaFac (Parallel Factors). One fundamental problem of these algorithms remains unsolved: are the solutions they find globally optimal? Surprisingly, we provide a positive answer for HOSVD and a negative answer for ParaFac by combining theoretical analysis and experimental evidence. Our discovery of this intrinsic property of HOSVD assures us that in real-world applications HOSVD provides repeatable and reliable results.
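The kind of repeatability check described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code or experimental setup: it assumes the iterative HOSVD update takes the common alternating form (often called HOOI), and the helper names (`unfold`, `mode_product`, `hooi`), the synthetic tensor, the ranks, and the seeds are all hypothetical choices made for the example. It runs the iteration from two random initializations and compares the final least-squares errors and the mode-wise subspaces.

```python
import numpy as np

def unfold(T, mode):
    """Matricize tensor T so that `mode` indexes the rows."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply T along `mode` by matrix M of shape (new_dim, T.shape[mode])."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis is replaced at the front
    return np.moveaxis(out, 0, mode)

def hooi(T, ranks, n_iter=50, seed=0):
    """Alternating least-squares updates of orthonormal factors (assumed HOOI form)."""
    rng = np.random.default_rng(seed)
    # Random orthonormal starting factors, one per mode.
    factors = [np.linalg.qr(rng.standard_normal((n, r)))[0]
               for n, r in zip(T.shape, ranks)]
    errors = []
    for _ in range(n_iter):
        for mode in range(T.ndim):
            # Project T onto the current subspaces of all other modes.
            Y = T
            for m in range(T.ndim):
                if m != mode:
                    Y = mode_product(Y, factors[m].T, m)
            # The leading left singular vectors give the optimal factor for this mode.
            U, _, _ = np.linalg.svd(unfold(Y, mode), full_matrices=False)
            factors[mode] = U[:, :ranks[mode]]
        # Core tensor and reconstruction for the least-squares objective.
        core = T
        for m in range(T.ndim):
            core = mode_product(core, factors[m].T, m)
        approx = core
        for m in range(T.ndim):
            approx = mode_product(approx, factors[m], m)
        errors.append(np.linalg.norm(T - approx))
    return factors, errors

# Hypothetical test tensor: low multilinear-rank signal plus small noise.
rng = np.random.default_rng(42)
shape, ranks = (8, 9, 10), (3, 3, 3)
signal = rng.standard_normal(ranks)
for m, n in enumerate(shape):
    signal = mode_product(signal, rng.standard_normal((n, ranks[m])), m)
T = signal + 0.01 * rng.standard_normal(shape)

factors_a, errors_a = hooi(T, ranks, seed=1)
factors_b, errors_b = hooi(T, ranks, seed=2)

# Factors are only defined up to rotation, so compare subspaces via projectors U U^T.
subspace_gaps = [np.linalg.norm(Ua @ Ua.T - Ub @ Ub.T)
                 for Ua, Ub in zip(factors_a, factors_b)]
print("final errors:", errors_a[-1], errors_b[-1])
print("subspace gaps per mode:", subspace_gaps)
```

Comparing projectors rather than the factor matrices themselves is a deliberate choice: two runs that reach the same solution may still differ by an orthogonal rotation within each subspace. Small gaps and matching errors on a given dataset are consistent with the uniqueness claim but do not prove it; the same harness could be pointed at a ParaFac routine to look for runs that disagree.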
