Citation
Abduh Kaid, Monir Abdullah
(2009)
Load-Balancing Models for Scheduling Divisible Load on Large Scale Data Grids.
Doctoral thesis, Universiti Putra Malaysia.
Abstract
In many data grid applications, data can be decomposed into multiple independent
sub-datasets and distributed for parallel execution. This property has been successfully
exploited using Divisible Load Theory (DLT), which has been proven to be a
powerful tool for modeling divisible load problems in large-scale data grids. Load
balancing in such environments plays a critical role in achieving high utilization of
resources, so that applications are scheduled efficiently through joint consideration of
communication and computation time. Several scheduling models have been studied,
such as Constraint DLT (CDLT), Task Data Present (TDP) and Genetic
Algorithm (GA). However, no optimal solution has been reached. At the same
time, effective schedulers are required to minimize not only the maximum completion
time (makespan) of the jobs but also the execution time of the schedulers themselves.
This thesis proposes several load-balancing models for scheduling divisible load on
large-scale data grids in which both processor and communication link speeds are heterogeneous.
The proposed models are developed in three stages. The first stage
develops new DLT-based models for multiple-source scheduling. Closed-form
solutions for the load allocation are derived. The new models are called Adaptive
DLT (ADLT) and A2DLT. In the second stage, an Iterative DLT (IDLT)
model is proposed. Recursive numerical equations are derived to find the optimal
workload assigned to each grid node, and closed-form solutions are derived for the
optimal load allocation. Although the IDLT model is proposed for a single source, it
has also been applied to the case of multiple sources. The third stage integrates the proposed
DLT-based models with a GA to solve the time-consumption problem.
In addition, the integration of the proposed DLT model with a Simulated Annealing
(SA) algorithm has also been developed.
The experimental results have proven that the proposed models yield better performance
than previous models in terms of makespan and scheduler execution time. The
ADLT and A2DLT models have reduced the makespan by 21% and 37%, respectively,
compared to the CDLT model. The IDLT model is capable of producing an almost optimal
solution for single-source scheduling with low time complexity. In addition, the integration
of the proposed DLT model with the GA and SA algorithms has also significantly
improved performance. SA is 64.70% better than GA in terms of makespan.
Thus, the proposed models can balance the processing loads efficiently, so they
can be integrated into existing data grid schedulers to improve performance.