Job Scheduling Strategies for Parallel Processing

The NAS facility has operated parallel supercomputers for the past 11 years, including the Intel iPSC/860, Intel Paragon, Thinking Machines CM-5, IBM SP-2, and Cray Origin 2000. Across this wide variety of machine architectures, across a span of 10 years, across a large number of different users, and through thousands of minor configuration and policy changes, the utilization of these machines shows three general trends: (1) scheduling using a naive FCFS first-fit policy results in 40-60% utilization, (2) switching to the more sophisticated dynamic backfilling scheduling algorithm improves utilization by about 15 percentage points (yielding about 70% utilization), and (3) reducing the maximum allowable job size further increases utilization. Most surprising is the consistency of these trends. Over the lifetime of the NAS parallel systems, we made hundreds, perhaps thousands, of small changes to hardware, software, and policy, yet utilization was affected little. In particular, these results show that the goal of achieving near 100% utilization while supporting a real parallel supercomputing workload is unrealistic.
