Trade-offs between Communication Throughput and Parallel Time

We study the effect of limited communication throughput on parallel computation in a setting where the number of processors is much smaller than the length of the input. Our model has p processors that communicate through a shared memory of size m. The input has size n and can be read directly by all the processors. We are primarily interested in cases where n ≫ p ≫ m. As a test case we study the list reversal problem. For this problem we prove a time lower bound of Ω(n/mp). (A similar lower bound also holds for the problems of sorting, finding all unique elements, convolution, and universal hashing.) This result demonstrates that limiting the communication (i.e., small m) can have a significant effect on parallel computation. We show an almost matching upper bound of O((n/mp) log^{O(1)} n). The upper bound requires the development of a few interesting techniques that can alleviate the limited communication in some general settings. Specifically, we show how to efficiently emulate a large shared memory on a limited shared memory. The lower bound applies even to randomized machines, and the upper bound is achieved by a randomized algorithm. We also argue that some standard methodology for designing parallel algorithms appears to require a relatively high level of communication throughput. Our results suggest that new methodologies that need less communication throughput must be invented for parallel machines that provide only a low level of it, since otherwise those machines will be severely handicapped as general-purpose parallel machines. Although we do not rule this out, we cannot offer any encouraging evidence to suggest that such new methodologies are likely to be found.
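For concreteness, the list reversal problem can be sketched sequentially as follows. The abstract does not define the problem, so this formulation is an assumption: the input is an array of successor indices describing a linked list, and the output is the corresponding array of predecessor indices. The function name `reverse_list` is hypothetical, and the sketch ignores the p processors and the shared memory of size m entirely; it only illustrates the input and output of the task.

```python
from typing import List, Optional


def reverse_list(succ: List[Optional[int]]) -> List[Optional[int]]:
    """Return predecessor pointers for a linked list given by successor pointers.

    Assumed encoding (not taken from the paper): succ[i] is the index of the
    element that follows element i, or None if i is the tail of the list.
    """
    pred: List[Optional[int]] = [None] * len(succ)
    for i, nxt in enumerate(succ):
        if nxt is not None:
            # Element i points to nxt, so in the reversed list nxt points to i.
            pred[nxt] = i
    return pred


# Example list in traversal order: 2 -> 0 -> 3 -> 1
succ = [3, None, 0, 1]
pred = reverse_list(succ)
print(pred)  # [2, 3, None, 0]
```

Each output entry pred[j] depends on an input entry stored at a different position, which hints at why the task is a natural test case when communication between processors is the scarce resource.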
