UPM Institutional Repository

An efficient relational to column oriented database schema transformation technique


Citation

Zaidi, Norwini (2019) An efficient relational to column oriented database schema transformation technique. Masters thesis, Universiti Putra Malaysia.

Abstract

NoSQL databases were introduced to meet the growing demands placed on database management systems, including the need to manage huge amounts of unstructured data. Because relational databases are limited in handling such data, migrating data from relational databases to NoSQL databases has become an important process in database management. Schema transformation is a key step in data migration, and various techniques have been proposed to improve schema transformation and data migration from relational to NoSQL databases. The most common schema transformation technique for NoSQL databases is denormalization. However, denormalization causes unnecessary data duplication in the NoSQL database, which increases storage size. Furthermore, NoSQL databases are limited in table joining and cannot perform queries across multiple tables. Existing schema transformation techniques based on nested table merging describe the merging of only two related tables. These inefficient techniques force queries to run on multiple tables, which results in high query processing time. This research proposes a schema transformation technique for migrating data from a relational database to a column-oriented database. The technique has three main steps: denormalization with read pattern, nested and multiple nested table merging, and rowkey design. Together, these steps reduce data redundancy and storage size and produce efficient query performance. In this technique, the read pattern identifies the access key of the query. Nested and multiple nested table merging combines the tables that share the same access key into a nested form, so that a query on the column-oriented database retrieves the data from a single table, which improves query performance. Meanwhile, the rowkey design determines the rowkey based on the access keys identified in the read pattern step. The experimental results show that, for the DELL DVD dataset, the proposed technique reduced data redundancy by eight column families, reducing the storage size by 13.83% and improving the query performance time by 29.28%. For the Employees dataset, the proposed technique reduced data redundancy by five column families, reducing the storage size by 15.67% and improving the query performance time by 29.13%.
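
The abstract names the three steps but does not give the algorithm, so the following is only a minimal Python sketch of the general idea, not the thesis's actual method. It assumes a hypothetical read pattern whose access key is `customer_id`, three illustrative tables (`customers`, `orders`, `order_lines`), a zero-padded rowkey, and plain dictionaries standing in for a column-oriented store. All table, column, and function names are invented for illustration.

```python
# Hypothetical relational rows; names and values are illustrative only.
customers = [
    {"customer_id": 1, "name": "Aina"},
    {"customer_id": 2, "name": "Badrul"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
    {"order_id": 12, "customer_id": 2, "total": 15.5},
]
order_lines = [
    {"line_id": 100, "customer_id": 1, "product": "DVD-A"},
    {"line_id": 101, "customer_id": 2, "product": "DVD-B"},
]


def make_rowkey(access_key_value, width=8):
    """Rowkey design (assumed): zero-pad the access key so that keys
    sort lexicographically in the same order as they sort numerically."""
    return str(access_key_value).zfill(width)


def merge_nested(access_key, parent, children):
    """Merge a parent table and any number of child tables that share
    the same access key into one wide table.

    Each source table becomes one column family. Child rows are nested
    under the parent's rowkey, with the child's own id used as a prefix
    of the column qualifier so that several child rows can coexist.
    """
    merged = {}
    for row in parent["rows"]:
        key = make_rowkey(row[access_key])
        columns = {c: v for c, v in row.items() if c != access_key}
        merged[key] = {parent["family"]: columns}

    for child in children:
        id_column = child["id_column"]
        for row in child["rows"]:
            key = make_rowkey(row[access_key])
            if key not in merged:
                continue  # skip child rows without a matching parent row
            family = merged[key].setdefault(child["family"], {})
            for column, value in row.items():
                if column in (access_key, id_column):
                    continue  # the access key already lives in the rowkey
                family[f"{row[id_column]}:{column}"] = value
    return merged


# Read pattern (assumed): queries look data up by customer_id.
wide = merge_nested(
    "customer_id",
    {"family": "customer", "rows": customers},
    [
        {"family": "orders", "id_column": "order_id", "rows": orders},
        {"family": "order_lines", "id_column": "line_id", "rows": order_lines},
    ],
)

# A single lookup on one table now returns the customer with all nested data.
customer_one = wide[make_rowkey(1)]
```

In this sketch the access key is stored once, in the rowkey, rather than being repeated in every child row, and a lookup such as `wide[make_rowkey(1)]` touches one table instead of three. How the thesis itself encodes nested rows and rowkeys may differ.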


Additional Metadata

Item Type: Thesis (Masters)
Subject: Non-relational databases
Subject: SQL (Computer program language)
Subject: Database management
Call Number: FSKTM 2020 1
Chairman Supervisor: Iskandar Ishak, PhD
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Mas Norain Hashim
Date Deposited: 03 Sep 2021 01:06
Last Modified: 03 Sep 2021 01:06
URI: http://psasir.upm.edu.my/id/eprint/90672