UPM Institutional Repository

A cluster-based hybrid replica control protocol for high availability in data grid


Citation

Mabni, Zulaile (2019) A cluster-based hybrid replica control protocol for high availability in data grid. Doctoral thesis, Universiti Putra Malaysia.

Abstract

Data Grid provides a scalable infrastructure for managing and storing large amounts of data files in Grid computing systems. In a Data Grid, data replication is a widely used technique for managing data, in which exact copies of data, or replicas, are created and stored at many distributed sites. This technique provides high data availability and improves the performance of distributed systems. In recent years, the number of distributed nodes in Grid computing systems has become very large, and this growth has raised several issues in data replication. The first issue is that nodes in Grid systems are dynamic: they can join or leave the system at any time. A replica control protocol must therefore take the dynamic aspects of the Data Grid into account. The next important issue is replica placement, which determines suitable nodes on which to place the replicas. Previously, replica placement was not an issue because research focused only on small-scale systems. However, in a larger system such as a Data Grid, the existing replica control protocols require a larger number of replicas to construct read and write quorums. As the number of replicas increases, the communication cost also increases, which degrades the performance of the protocols. Another issue is replica consistency, which must be ensured when copying data in a large-scale system. To maintain replica consistency, when several replicas of the same file are updated concurrently, all other replicas must end up with the same updated contents. Thus, an efficient mechanism is needed to improve system performance while ensuring replica consistency in the Data Grid. Therefore, in this thesis, we propose a new replica control protocol, named the Cluster-Based Hybrid (CBH) protocol, for large-scale systems, with the objectives of reducing communication cost, increasing data availability, and maintaining replica consistency.
CBH employs a hybrid replication strategy that combines the advantages of two common replica control protocols to improve on the performance of existing protocols. A clustering algorithm is proposed to group the large number of nodes into clusters and to organize these clusters into a tree structure. A second proposed algorithm, the replica placement algorithm, selects and places only one replica in each cluster. The performance of the CBH protocol is evaluated both theoretically and through simulation. The protocol is simulated using GridSim, a discrete event simulator, and the Java programming language. Two performance metrics, communication cost and data availability, are evaluated and compared with those of two recent quorum-based protocols: the Dynamic Hybrid (DH) protocol and the Duplication on Grid (DDG) protocol. The results show that grouping the nodes into clusters and keeping only one replica in each cluster minimizes the number of replicas involved in constructing read and write quorums. This research contributes a dynamic cluster-based hybrid replica control protocol that comprises a clustering algorithm to determine the number of clusters, a mechanism for dynamic participation of nodes in the network, and a replica placement algorithm that produces lower communication cost and higher data availability than the DH and DDG protocols. CBH is shown to maintain replica consistency by satisfying the Quorum Intersection Properties.
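
As a rough illustration of what satisfying quorum intersection properties means, the following Python sketch checks the two standard conditions for a quorum-based protocol: every read quorum overlaps every write quorum, and every pair of write quorums overlaps. It is not the CBH quorum construction, which the abstract does not describe. The majority-quorum scheme, the function names, and the five-cluster layout with replica names c1 to c5 are all assumptions made for the example.

```python
from itertools import combinations


def satisfies_quorum_intersection(read_quorums, write_quorums):
    """Return True if every read quorum overlaps every write quorum
    and every pair of write quorums overlaps."""
    # Read-write intersection: a read always sees at least one latest copy.
    rw_ok = all(r & w for r in read_quorums for w in write_quorums)
    # Write-write intersection: two writes cannot proceed independently.
    ww_ok = all(w1 & w2 for w1, w2 in combinations(write_quorums, 2))
    return rw_ok and ww_ok


def majority_quorums(replicas):
    """All subsets of size floor(n/2) + 1.

    Hypothetical scheme used only for illustration; it is not the
    quorum construction of CBH, DH, or DDG.
    """
    size = len(replicas) // 2 + 1
    return [frozenset(q) for q in combinations(sorted(replicas), size)]


# Assumed layout: five clusters, one replica placed in each cluster.
replicas = {"c1", "c2", "c3", "c4", "c5"}
quorums = majority_quorums(replicas)
consistent = satisfies_quorum_intersection(quorums, quorums)

# Counter-example: two disjoint write quorums violate the properties.
broken = satisfies_quorum_intersection(
    [frozenset({"c1"})],
    [frozenset({"c1", "c2"}), frozenset({"c3", "c4"})],
)

print(consistent, broken)
```

With one replica per cluster, the quorums are built over the number of clusters rather than the number of nodes, which is the intuition behind the reduced quorum size reported in the abstract.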


Additional Metadata

Item Type: Thesis (Doctoral)
Subject: Computational grids (Computer systems)
Call Number: FSKTM 2019 45
Chairman Supervisor: Associate Professor Dr Rohaya Latip
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Mas Norain Hashim
Date Deposited: 17 Feb 2021 03:38
Last Modified: 31 Dec 2021 08:24
URI: http://psasir.upm.edu.my/id/eprint/84549