Scheduling Inverse Trees Under the Communication Model of the LogP-Machine

Abstract

Existing scheduling strategies for task graphs mainly assume machine models that ignore properties of existing parallel architectures: the overhead on the processors for communication and the bandwidth of the interconnection network are neglected. The LogP-machine better reflects these properties. Much is known about scheduling task graphs when the overhead (o) and the bandwidth per processor (1/g) are ignored and only latencies are considered. In that setting, an optimal schedule can be computed in polynomial time for some classes of task graphs (e.g. coarse grained trees), while for other classes (e.g. fine grained trees) the problem is NP-hard. The aim of this article is to extend these results to the LogP-machine. Restricting ourselves to linear schedules (i.e. no two independent tasks are scheduled on the same processor), we show that for inverse tree-like task graphs (which include inverse trees) optimal linear schedules can be found in polynomial time when g - o is constant and the minimal computation time of a task is at least g - o, no matter whether the trees are coarse grained or not. The same result holds for optimal linear restricted schedules, where a schedule is restricted if, for each task that has direct predecessors, at least one of them is scheduled on the same processor. On the other hand, we show that finding optimal (restricted) schedules for inverse trees is NP-complete even when g = 0.
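The two structural properties named in the abstract, linear and restricted, can be illustrated with a minimal sketch. It is not the article's algorithm: it only checks a given task-to-processor assignment, and it ignores start times and the LogP parameters. The representation (a `succ` dictionary mapping each task of an inverse tree to its single successor) and all function names are assumptions made for this illustration.

```python
from itertools import combinations

def later_tasks(succ, task):
    """Tasks that depend on `task`: follow successor edges up to the root."""
    seen = set()
    while task in succ:
        task = succ[task]
        seen.add(task)
    return seen

def is_linear(succ, assignment):
    """True if no two independent tasks share a processor."""
    by_proc = {}
    for task, proc in assignment.items():
        by_proc.setdefault(proc, []).append(task)
    for tasks in by_proc.values():
        for a, b in combinations(tasks, 2):
            dependent = b in later_tasks(succ, a) or a in later_tasks(succ, b)
            if not dependent:
                return False
    return True

def is_restricted(succ, assignment):
    """True if every task with direct predecessors shares a processor
    with at least one of them."""
    preds = {}
    for task, nxt in succ.items():
        preds.setdefault(nxt, []).append(task)
    return all(
        any(assignment[p] == assignment[t] for p in ps)
        for t, ps in preds.items()
    )

# Hypothetical inverse tree: a and b feed c; c and d feed the root e.
succ = {"a": "c", "b": "c", "c": "e", "d": "e"}

chain = {"a": 0, "c": 0, "e": 0, "b": 1, "d": 2}      # linear and restricted
shared = {"a": 0, "b": 0, "c": 0, "e": 0, "d": 1}     # a, b independent: not linear
detached = {"a": 0, "b": 1, "c": 2, "e": 2, "d": 3}   # linear, but c has no predecessor on its processor
```

In this example `chain` satisfies both properties, `shared` is restricted but not linear, and `detached` is linear but not restricted.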
