Simultaneous clustering of multi-type relational data via symmetric nonnegative matrix tri-factorization

The rapid growth of Internet and modern technologies has brought data involving objects of multiple types that are related to each other, called as multi-type relational data. Traditional clustering methods for single-type data rarely work well on them, which calls for more advanced clustering techniques to deal with multiple types of data simultaneously to utilize their interrelatedness. A major challenge in developing simultaneous clustering methods is how to effectively use all available information contained in a multi-type relational data set including inter-type and intra-type relationships. In this paper, we propose a Symmetric Nonnegative Matrix Tri-Factorization (S-NMTF) framework to cluster multi-type relational data at the same time. The proposed S-NMTF approach employs NMTF to simultaneously cluster different types of data using their inter-type relationships, and incorporate the intra-type information through manifold regularization. In order to deal with the symmetric usage of the factor matrix in S-NMTF, we present a new generic matrix inequality to derive the solution algorithm, which involves a fourth-order matrix polynomial, in a principled way. Promising experimental results have validated the proposed approach.

[1]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[2]  Chris H. Q. Ding,et al.  Bipartite graph partitioning and data clustering , 2001, CIKM '01.

[3]  Claire Cardie,et al.  Proceedings of the Eighteenth International Conference on Machine Learning, 2001, p. 577–584. Constrained K-means Clustering with Background Knowledge , 2022 .

[4]  Inderjit S. Dhillon,et al.  Co-clustering documents and words using bipartite spectral graph partitioning , 2001, KDD '01.

[5]  Zhang Changshui,et al.  Reply networks on a bulletin board system. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[6]  Chris H. Q. Ding,et al.  On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering , 2005, SDM.

[7]  Philip S. Yu,et al.  Spectral clustering for multi-type relational data , 2006, ICML.

[8]  Zheng Chen,et al.  Latent semantic analysis for multiple-type interrelated data objects , 2006, SIGIR.

[9]  Chris H. Q. Ding,et al.  Orthogonal nonnegative matrix t-factorizations for clustering , 2006, KDD '06.

[10]  L. Rosasco,et al.  Manifold Regularization , 2007 .

[11]  Fei Wang,et al.  Semi-Supervised Clustering via Matrix Factorization , 2008, SDM.

[12]  Quanquan Gu,et al.  Co-clustering on manifolds , 2009, KDD.

[13]  Chris H. Q. Ding,et al.  Convex and Semi-Nonnegative Matrix Factorizations , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Thomas S. Huang,et al.  Graph Regularized Nonnegative Matrix Factorization for Data Representation. , 2011, IEEE transactions on pattern analysis and machine intelligence.

[15]  Feiping Nie,et al.  Cross-language web page classification via dual knowledge transfer using nonnegative matrix tri-factorization , 2011, SIGIR.

[16]  Feiping Nie,et al.  Dyadic transfer learning for cross-domain image classification , 2011, 2011 International Conference on Computer Vision.