Partitioning of unstructured meshes for load balancing

Many large-scale engineering and scientific calculations involve repeated updating of variables on an unstructured mesh. To perform such computations on distributed-memory parallel computers, the mesh must be partitioned among the processors so that the load balance is maximized and inter-processor communication time is minimized. This can be approximated by the problem of partitioning a graph so as to obtain a minimum cut, a well-studied combinatorial optimization problem. Graph partitioning is NP-complete, so for real-world applications one resorts to heuristics, i.e., algorithms that give good but not necessarily optimal solutions. These algorithms include local search methods such as Kernighan-Lin, recursive spectral bisection, and more general-purpose methods such as simulated annealing. We show that a general procedure enables us to combine simulated annealing with Kernighan-Lin. The resulting algorithm is both very fast and extremely effective.
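As an illustration only, the idea of annealing over locally optimized partitions can be sketched as follows. The abstract does not specify the procedure, so everything here is an assumption: the function names, the parameters (`steps`, `temperature`, `kick`), and in particular `greedy_swap_descent`, which is a simplified best-swap local search standing in for Kernighan-Lin rather than an implementation of it. Each step perturbs the current bisection with a few random swaps, re-optimizes locally, and accepts or rejects the result with a Metropolis test on the cut size.

```python
import math
import random


def cut_size(edges, side):
    """Count edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if side[u] != side[v])


def greedy_swap_descent(edges, side):
    """Simplified local search (a stand-in for Kernighan-Lin, not KL itself):
    repeatedly apply the best balance-preserving swap while it lowers the cut."""
    side = dict(side)
    best = cut_size(edges, side)
    improved = True
    while improved:
        improved = False
        left = [v for v in side if side[v] == 0]
        right = [v for v in side if side[v] == 1]
        best_pair = None
        for a in left:
            for b in right:
                side[a], side[b] = 1, 0      # try the swap
                c = cut_size(edges, side)
                side[a], side[b] = 0, 1      # undo it
                if c < best:
                    best, best_pair = c, (a, b)
        if best_pair is not None:
            a, b = best_pair
            side[a], side[b] = 1, 0
            improved = True
    return side, best


def anneal_with_local_search(edges, nodes, steps=200, temperature=0.5,
                             kick=2, seed=0):
    """Hypothetical sketch: Metropolis acceptance over local minima."""
    rng = random.Random(seed)
    nodes = list(nodes)
    rng.shuffle(nodes)
    half = len(nodes) // 2
    # Random balanced starting bisection, then descend to a local minimum.
    side = {v: (0 if i < half else 1) for i, v in enumerate(nodes)}
    side, cost = greedy_swap_descent(edges, side)
    best_side, best_cost = dict(side), cost
    for _ in range(steps):
        trial = dict(side)
        for _ in range(kick):                # random balance-preserving kick
            a = rng.choice([v for v in trial if trial[v] == 0])
            b = rng.choice([v for v in trial if trial[v] == 1])
            trial[a], trial[b] = 1, 0
        trial, trial_cost = greedy_swap_descent(edges, trial)
        delta = trial_cost - cost
        # Accept improvements always, worse local minima with Boltzmann odds.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            side, cost = trial, trial_cost
            if cost < best_cost:
                best_side, best_cost = dict(side), cost
    return best_side, best_cost
```

Here `edges` is a list of vertex pairs and `side` maps each vertex to part 0 or 1; a mesh would first be converted to such a graph, for example with one vertex per element and one edge per shared face. The exhaustive swap search costs far more per pass than a real Kernighan-Lin implementation and is only suitable for small examples.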
