Mining Large Dense Subgraphs

Several applications including community detection in social networks and discovering correlated genes involve finding large subgraphs of high density. We propose the problem of finding the largest subgraph of a given density. The problem is a generalization of the Max-Clique problem which seeks the largest subgraph that has an edge density of 1. We define an objective function and prove that its optimization results in the largest graph of given density. We propose an algorithm that finds the subgraph by running multiple local search heuristics with random restarts. For massive graphs, where running the algorithm directly may be intractable, we use a sampling technique that reduces the graph to a smaller one which is likely to contain large dense subgraphs. We evaluate our algorithm on multiple real life and synthetic datasets. Our experiments show that our algorithm performs as well as the state-of-the-art for finding large subgraphs of high density, while providing density guarantees.