PARA'04 State-of-the-Art
in Scientific Computing
June 20-23, 2004 (Home page)

Updated: 8 February 2004

A Novel Task Scheduling Algorithm for Distributed Heterogeneous Computing Systems

Guan-Joe Lai
National Tai-Chung Teachers College
Tai-Chung, Taiwan, R.O.C.
email: gjlai@mail.ntctc.edu.tw

This paper proposes a novel task scheduling algorithm for distributed heterogeneous computing environments. A distributed heterogeneous computing system generally consists of a heterogeneous suite of machines (i.e., workstations or PCs), high-speed networks, communication protocols, and programming environments. Although distributed heterogeneous computing systems offer significant advantages for high-performance computing, the effective use of heterogeneous resources still remains a major challenge. For effective utilization of diverse resources, an application could be partitioned into a set of tasks presented by an edge-weighted directed acyclic graph, such that each task could be scheduled to the best-suited machine to minimize the total completion time.

Many heuristics have focused on solving the NP-complete problem of efficiently scheduling tasks to heterogeneous computing systems. For example, the generalized Dynamic Level Scheduling (DLS) algorithm [1], the Heterogeneous Earliest Finish Time (HEFT) [2] and Critical Path on a Processor (CROP) technique [2]. However, most of these algorithms simply focus on computational aspects, so that the communication may become the system bottleneck as the computational power increases (particularly while executing applications with huge communication requirements). These strategies may perform poorly in distributed heterogeneous computing systems because of the heterogeneity of computational power and that of communication mechanisms.

A list-scheduling based algorithm, called the Dominant Tasks Scheduling (DTS) algorithm, is proposed here to exploit the potential of parallel processing, allowing for system heterogeneity and network bandwidth. In general, the list-scheduling approach assigns priorities to tasks and then allocates these tasks to machines in the order of their priorities to minimize a predefined cost function [3]. Most of them examine a ready task for scheduling only after all parents of the task have been scheduled. However, the operation of scheduling some ready tasks should not affect the strict reduction of the earliest starting time of the tasks in the critical path; otherwise, it may extend the overall parallel scheduling length. This paper proposes a condition to avoid the strict reduction problem, and furthermore a condition to exploit schedule holes. The schedule holes arise primarily because a task is scheduled after some other tasks with higher scheduling priorities, but could be scheduled before these tasks without affecting their earliest starting times. Due to the adoption of the above two conditions, the DTS algorithm could produce better schedules than those obtained by existing algorithms.

In this paper, experimental results are presented to verify the preceding theoretical claims. Three algorithms (i.e., DLS, HEFT and CROP) are applied to comparatively demonstrate the proposed algorithm's effectiveness. The scheduling performance of these algorithms is compared in different computation/communication ratio. In order to evaluate these algorithms, we apply six practical applications, including the fork tasks, the join tasks, the fork-join tasks, the FFT, the Gaussian elimination, and the LU decomposition. The experimental results show the superiority of the DTS algorithm. As the system heterogeneity increases, the performance improvement of the DTS algorithm is more outstanding than that of the proposed algorithms in the literature. Therefore, the proposed scheduling algorithm may be used in designing efficient parallel environments for those situations where the system heterogeneity is the system performance bottleneck.

Reference:
1. G.C. Shih and E.A. Lee, "A Compile-Time Scheduling Heuristic for Interconnection-Constrained Heterogeneous Processor Architectures," IEEE Trans. Parallel and Distributed Systems, vol. 4, no. 2, pp. 175-187, 1993.
2. H. Topcuoglu, S. Hariri, and M.Y. Wu, "Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing," IEEE Trans. Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, 2002.
3. Y. Kwok and I. Ahmad, "Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors," ACM Computing Surveys, Vol. 31, No. 4, December 1999, pp.406-471.

Home page


2004-02-08