Non-negative Tri-factor tensor decomposition with applications

Non-negative matrix factorization (NMF) mainly focuses on discovering the hidden patterns behind a collection of vectors, that is, two-way data. Here, we propose a tensor decomposition model, Tri-ONTD, to analyze three-way data. The model aims to discover the characteristics common to a series of matrices while identifying the peculiarities of each matrix, thereby revealing the cluster structure in the data. In particular, Tri-ONTD performs adaptive dimension reduction for tensors because it integrates subspace identification (i.e., finding a low-dimensional representation with a common basis for a set of matrices) and clustering into a single process. Tri-ONTD can also be regarded as an extension of the tri-factor NMF model. We present the optimization algorithm in detail and prove its convergence. Experimental results on real-world datasets demonstrate the effectiveness of the proposed method in author clustering, image clustering, and image reconstruction. In addition, the results produced by the proposed model have sparse and localized structures.
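To make the idea of a common basis shared across a set of matrices concrete, the following is a minimal sketch, not the Tri-ONTD algorithm itself: the abstract gives neither the objective nor the update rules. It assumes the simplest shared-basis tri-factor form, X_k ≈ F S_k Gᵀ, with non-negative F and G common to all slices and a small non-negative S_k per slice, fitted by Lee-Seung-style multiplicative updates on the summed squared error. No orthogonality constraint is imposed, and every name here (`shared_tri_factorize`, `F`, `G`, `Ss`, `p`, `q`) is hypothetical.

```python
import random
from functools import reduce

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def T(A):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*A)]

def madd(A, B):
    """Element-wise sum of two equally shaped matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mult_update(M, num, den, eps=1e-9):
    """Element-wise M * num / (den + eps); non-negative inputs stay non-negative."""
    return [[m * n / (d + eps) for m, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(M, num, den)]

def rand_matrix(rows, cols, rng):
    """Strictly positive random matrix (assumed initialisation scheme)."""
    return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

def loss(Xs, F, Ss, G):
    """Sum over slices of the squared Frobenius error ||X_k - F S_k G^T||^2."""
    Gt = T(G)
    total = 0.0
    for X, S in zip(Xs, Ss):
        R = matmul(matmul(F, S), Gt)
        total += sum((x - r) ** 2 for rx, rr in zip(X, R) for x, r in zip(rx, rr))
    return total

def shared_tri_factorize(Xs, p, q, iters=100, seed=0):
    """Fit X_k ~ F S_k G^T with F (n x p) and G (m x q) shared by all slices."""
    rng = random.Random(seed)
    n, m = len(Xs[0]), len(Xs[0][0])
    F, G = rand_matrix(n, p, rng), rand_matrix(m, q, rng)
    Ss = [rand_matrix(p, q, rng) for _ in Xs]
    history = [loss(Xs, F, Ss, G)]
    for _ in range(iters):
        # Shared row basis: F <- F * (sum_k X_k G S_k^T) / (F sum_k S_k G^T G S_k^T)
        GS = [matmul(G, T(S)) for S in Ss]
        num = reduce(madd, [matmul(X, gs) for X, gs in zip(Xs, GS)])
        core = reduce(madd, [matmul(T(gs), gs) for gs in GS])
        F = mult_update(F, num, matmul(F, core))
        # Shared column basis: G <- G * (sum_k X_k^T F S_k) / (G sum_k S_k^T F^T F S_k)
        FS = [matmul(F, S) for S in Ss]
        num = reduce(madd, [matmul(T(X), fs) for X, fs in zip(Xs, FS)])
        core = reduce(madd, [matmul(T(fs), fs) for fs in FS])
        G = mult_update(G, num, matmul(G, core))
        # Slice-specific cores: S_k <- S_k * (F^T X_k G) / (F^T F S_k G^T G)
        FtF, GtG = matmul(T(F), F), matmul(T(G), G)
        Ss = [mult_update(S, matmul(matmul(T(F), X), G),
                          matmul(matmul(FtF, S), GtG))
              for X, S in zip(Xs, Ss)]
        history.append(loss(Xs, F, Ss, G))
    return F, Ss, G, history

# Synthetic three-way data: three 6 x 5 slices built from known non-negative factors.
rng = random.Random(42)
F0, G0 = rand_matrix(6, 2, rng), rand_matrix(5, 2, rng)
Xs = [matmul(matmul(F0, rand_matrix(2, 2, rng)), T(G0)) for _ in range(3)]

F, Ss, G, history = shared_tri_factorize(Xs, p=2, q=2, iters=100)
print("loss before and after:", history[0], history[-1])
```

In this sketch F and G play the role of the characteristics common to all slices, while each S_k carries what is specific to slice k. Under the same assumptions, a rough cluster label for row i could be read off as the index of the largest entry in row i of F.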
