The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering

Nonnegative matrix factorization (NMF) has recently been shown to be useful for clustering, and various extensions and variations of NMF have been proposed. Despite significant research progress in this area, few attempts have been made to establish the connections among the various factorization methods while highlighting their differences. In this paper we aim to provide a comprehensive study of matrix factorization for clustering. In particular, we present an overview and summary of various matrix factorization algorithms and theoretically analyze the relationships among them. Experiments are also conducted to empirically evaluate and compare the factorization methods. In addition, our study answers several previously unaddressed yet important questions about matrix factorization, including the interpretation and normalization of the cluster posterior and the benefits and evaluation of simultaneous clustering. We expect our study to provide good insights into matrix factorization research for clustering.
