A Network Embedding-Enhanced Approach for Generalized Community Detection

Community detection is one of the most important tasks in network analysis, and many methods for it have been proposed recently. However, these methods typically focus on assortative community structures (i.e., nodes in the same community have more connections with each other than with nodes in other communities) and ignore the diversity of community patterns in the real world. In addition, the network topology on which these methods mainly rely is often noisy and very sparse. Both issues make it difficult for existing methods to find communities accurately. To address these problems, we propose a new probabilistic generative model. In this model, we first use mixture modeling to describe network regularities, and then introduce network embeddings to further enhance the model's ability to describe network communities. As a result, the new model can not only find generalized communities (e.g., assortative communities, disassortative communities, and their mixtures), but is also robust in complicated situations (e.g., on very sparse networks with heavy noise). We present an efficient expectation-maximization (EM) algorithm to learn the model. Finally, we demonstrate the superior performance of our approach over several state-of-the-art methods on both synthetic and real networks, and validate its robustness to the above issues through a case study.
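
The abstract does not give the model's equations, so the following is only a rough illustration of the general idea of fitting a mixture model to a network with EM. It is not the proposed model. It assumes a simple Newman-Leicht style mixture in which each group r has a prior pi[r] and a distribution theta[r][j] over the nodes that its members link to, and it leaves out the network-embedding component entirely. The function name mixture_em, its parameters, and the smoothing constant eps are hypothetical choices made for this sketch.

```python
import math
import random


def mixture_em(adj, n_groups, n_iter=50, seed=0):
    """Fit a simple link-pattern mixture model to an adjacency matrix with EM.

    Illustrative sketch only (assumed Newman-Leicht style model), not the
    model proposed in the paper. Returns hard labels and soft memberships.
    """
    rng = random.Random(seed)
    n = len(adj)
    eps = 1e-12  # hypothetical smoothing constant to avoid log(0)

    # Random soft memberships q[i][r]; each row sums to 1.
    q = []
    for _ in range(n):
        row = [rng.random() + eps for _ in range(n_groups)]
        total = sum(row)
        q.append([v / total for v in row])

    degree = [sum(row) for row in adj]

    for _ in range(n_iter):
        # M-step: group priors and per-group link distributions.
        pi = [sum(q[i][r] for i in range(n)) / n for r in range(n_groups)]
        theta = []
        for r in range(n_groups):
            denom = sum(degree[i] * q[i][r] for i in range(n)) + eps
            theta.append(
                [sum(adj[i][j] * q[i][r] for i in range(n)) / denom
                 for j in range(n)]
            )

        # E-step: posterior group membership of each node, in log space.
        for i in range(n):
            logp = [
                math.log(pi[r] + eps)
                + sum(adj[i][j] * math.log(theta[r][j] + eps) for j in range(n))
                for r in range(n_groups)
            ]
            top = max(logp)
            weights = [math.exp(v - top) for v in logp]
            total = sum(weights)
            q[i] = [w / total for w in weights]

    labels = [max(range(n_groups), key=lambda r: q[i][r]) for i in range(n)]
    return labels, q


# Usage: a small bipartite (disassortative) network, nodes 0-2 vs. nodes 3-5.
adj = [[0, 0, 0, 1, 1, 1]] * 3 + [[1, 1, 1, 0, 0, 0]] * 3
labels, memberships = mixture_em(adj, n_groups=2)
print(labels)
```

Because groups are defined by whom their members link to rather than by internal link density, a model of this kind can separate the two sides of a bipartite network, a disassortative pattern that density-based methods tend to miss.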
