Dynamic Remapping of Parallel Computations with Varying Resource Demands

The issue of deciding when to invoke a global load remapping mechanism is studied. Such a decision policy must effectively weigh the costs of remapping against the performance benefits, and should be general enough to apply automatically to a wide range of computations. The authors propose a general mapping decision heuristic, then study its effectiveness and its anticipated behavior on two very different models of load evolution. Assuming only that the remapping cost is known, this policy dynamically minimizes system degradation (including the cost of remapping) for each computation step. This policy is quite simple, choosing to remap when the first local minimum in the degradation function is detected. Simulations show that the decision obtained provides significantly better performance than that achieved by never remapping. The authors also observe that the average intermapping frequency is quite close to the optimal fixed remapping frequency.
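The decision rule described above can be illustrated with a minimal sketch. The abstract does not give the degradation function, so the form used here is an assumption: the time lost to imbalance since the last remapping, plus the known remapping cost, averaged over the steps taken so far. The names `first_rise_step`, `step_losses`, and `remap_cost` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Optional


def first_rise_step(step_losses: Iterable[float], remap_cost: float) -> Optional[int]:
    """Return the 1-based step at which the first rise in degradation is seen.

    ASSUMPTION (not stated in the abstract): degradation after n steps is
        W(n) = (sum of per-step imbalance losses up to n + remap_cost) / n
    where a per-step loss could be, for example, the slowest processor's
    time minus the mean processor time for that step.

    A rise W(n) > W(n-1) means step n-1 was the first local minimum,
    so the policy would choose to remap at step n. Returns None if the
    degradation never rises over the observed steps.
    """
    cumulative_loss = 0.0
    previous = None
    for n, loss in enumerate(step_losses, start=1):
        cumulative_loss += loss
        degradation = (cumulative_loss + remap_cost) / n
        if previous is not None and degradation > previous:
            return n
        previous = degradation
    return None


# Imbalance grows by one unit per step; remapping costs four units.
growing = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(first_rise_step(growing, remap_cost=4.0))

# Constant imbalance: the averaged remapping cost keeps shrinking,
# so no rise is detected in this window.
steady = [1.0, 1.0, 1.0, 1.0]
print(first_rise_step(steady, remap_cost=4.0))
```

Under this assumed form, the rule needs only a running sum and the previous degradation value, which is consistent with the abstract's description of the policy as simple and requiring only the remapping cost to be known.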
