Incorporating Network Embedding into Markov Random Field for Better Community Detection

Recent research on community detection focuses on learning node representations with network embedding methods and then feeding them as ordinary features to clustering algorithms. Although direct clustering on such embedding features can give good results, we find that there is ample room for improvement. More seriously, in many real networks, some statistically significant nodes that play pivotal roles are often assigned to the wrong communities by network embedding methods. This is because, although embedding uses distance measures to capture the spatial relationships between nodes, the nodes are essentially no longer coupled once they are mapped to feature vectors, so important structural information is lost. To address this problem, we propose a general Markov Random Field (MRF) framework that incorporates coupling into network embedding and thereby detects network communities more accurately. By exploiting the properties of MRFs, the new framework not only preserves the advantages of network embedding (e.g., low complexity, high parallelizability, and applicability to traditional machine learning), but also alleviates its core drawback, the inadequate representation of dependencies, by restoring the missing coupling relationships. Experiments on real networks show that our approach improves the accuracy of existing embedding methods (e.g., Node2Vec, DeepWalk, and MNMF) and corrects most of the wrongly assigned statistically significant nodes, which makes network embedding suitable for real community detection applications. The new approach also outperforms other state-of-the-art conventional community detection methods.
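
To make the idea of restoring coupling concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes a simple pairwise MRF in which the unary energy is the squared distance from a node's embedding to a community centroid and the pairwise energy charges a fixed penalty for every edge whose endpoints disagree, and it minimizes that energy with iterated conditional modes (ICM). The function name `refine_labels_icm`, the `coupling` weight, the choice of potentials, and the toy graph are all hypothetical and chosen only for illustration.

```python
import numpy as np


def refine_labels_icm(embeddings, adjacency, labels, n_communities,
                      coupling=1.0, n_sweeps=10):
    """Refine cluster labels with a simple pairwise MRF, optimized by ICM.

    Assumed energy (illustrative only):
      unary    = squared distance from a node's embedding to a community centroid
      pairwise = `coupling` for every edge whose two endpoints have different labels
    """
    labels = np.asarray(labels, dtype=int).copy()
    n_nodes = embeddings.shape[0]
    for _ in range(n_sweeps):
        # Recompute centroids from the current labels; empty communities cost +inf.
        unary = np.full((n_nodes, n_communities), np.inf)
        for k in range(n_communities):
            members = labels == k
            if members.any():
                centroid = embeddings[members].mean(axis=0)
                unary[:, k] = ((embeddings - centroid) ** 2).sum(axis=1)
        changed = False
        for i in range(n_nodes):
            neighbors = np.flatnonzero(adjacency[i])
            # Number of neighbors currently assigned to each community.
            counts = np.bincount(labels[neighbors], minlength=n_communities)
            # Cost of label k = unary cost + penalty for each disagreeing neighbor.
            energy = unary[i] + coupling * (len(neighbors) - counts)
            best = int(np.argmin(energy))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Toy graph: two 4-node cliques joined by the single edge 3-4.
adjacency = np.zeros((8, 8), dtype=int)
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                adjacency[i, j] = 1
adjacency[3, 4] = adjacency[4, 3] = 1

# Hypothetical embeddings: node 3 sits slightly closer to the second clique.
embeddings = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.55, 0.0],
                       [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])

# Labels as a distance-only clustering might return them (node 3 misplaced).
initial = np.array([0, 0, 0, 1, 1, 1, 1, 1])
refined = refine_labels_icm(embeddings, adjacency, initial, n_communities=2)
print(refined)
```

In this toy setting, node 3 lies nearer to the second group in embedding space, but three of its four neighbors belong to the first clique, so the pairwise term moves it back. ICM is used here only because it is short to write down; it is a greedy local optimizer, and the inference procedure in the paper may differ.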
