A min-max cut algorithm for graph partitioning and data clustering

An important application of graph partitioning is data clustering using a graph model: the pairwise similarities between all data objects form a weighted graph adjacency matrix that contains all the information needed for clustering. In this paper, we propose a new graph partitioning algorithm whose objective function follows the min-max clustering principle. Relaxing the optimization of the min-max cut objective function leads to the Fiedler vector used in spectral graph partitioning. Theoretical analysis of min-max cut indicates that it leads to balanced partitions, and lower bounds are derived. The min-max cut algorithm is tested on newsgroup data sets and is found to outperform other currently popular partitioning/clustering methods. Linkage-based refinements to the algorithm further improve the clustering quality substantially. We also demonstrate that a linearized search order based on the linkage differential is better than one based on the Fiedler vector, providing another effective partitioning method.
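To make the spectral step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly stated form of the min-max cut objective, Mcut(A, B) = cut(A, B)/W(A) + cut(A, B)/W(B), where W(A) is the total similarity inside A, and it assumes the Fiedler vector is taken from the generalized problem (D - W)x = λDx. The function names `mcut`, `fiedler_vector`, and `mcut_bisect` and the toy matrix are illustrative only; the linkage-based refinement and the linkage-differential ordering mentioned in the abstract are not shown.

```python
import numpy as np

def mcut(W, mask):
    """Assumed objective: cut(A,B)/W(A) + cut(A,B)/W(B) for the split given by mask."""
    A, B = mask, ~mask
    cut = W[A][:, B].sum()   # similarity crossing the partition
    wa = W[A][:, A].sum()    # similarity inside A
    wb = W[B][:, B].sum()    # similarity inside B
    if wa == 0 or wb == 0:
        return np.inf        # degenerate split with an empty or isolated side
    return cut / wa + cut / wb

def fiedler_vector(W):
    """Second-smallest generalized eigenvector of (D - W) x = lambda D x."""
    d = W.sum(axis=1)        # node degrees, assumed strictly positive
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    return d_inv_sqrt * vecs[:, 1]    # map back to the generalized problem

def mcut_bisect(W):
    """Sort nodes by Fiedler value and keep the split with the smallest objective."""
    order = np.argsort(fiedler_vector(W))
    best_mask, best_val = None, np.inf
    for k in range(1, len(order)):
        mask = np.zeros(len(order), dtype=bool)
        mask[order[:k]] = True
        val = mcut(W, mask)
        if val < best_val:
            best_mask, best_val = mask, val
    return best_mask, best_val

# Toy similarity matrix: two triangles joined by one weak edge (hypothetical data).
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.1

mask, value = mcut_bisect(W)
print(np.flatnonzero(mask), value)
```

The linear scan over the sorted Fiedler vector is one simple way to turn the relaxed, continuous solution back into a discrete partition; other rounding rules are possible.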
