Scheduling Large Jobs by Abstraction Refinement

Thomas A. Henzinger, Vasu Singh, Thomas Wies, and Damien Zufferey

The static scheduling problem often arises as a fundamental problem in real-time systems and grid computing. We consider the problem of statically scheduling a large job expressed as a task graph on a large number of computing nodes, such as a data center. This paper solves the large-scale static scheduling problem using abstraction refinement, a technique commonly used in formal verification to efficiently solve computationally hard problems. A scheduler based on abstraction refinement first attempts to solve the scheduling problem with abstract representations of the job and the computing resources. As abstract representations are generally small, the scheduling can be done reasonably fast. If the obtained schedule does not meet specified quality conditions (like data center utilization or schedule makespan), then the scheduler refines the job and data center abstractions and, again solves the scheduling problem. We develop different schedulers based on abstraction refinement. We implemented these schedulers and used them to schedule task graphs from various computing domains on simulated data centers with realistic topologies.We compare the speed of scheduling and the quality of the produced schedules with our abstraction refinement schedulers against a baseline scheduler that does not use any abstraction.We conclude that abstraction refinement techniques give a significant speedup, for a reasonable cost in the quality of the scheduling.

EuroSYS, ACM Press, 2011.

Download inofficial, sometimes updated PDF document.