Citation
Zaidi, Norwini
(2019)
An efficient relational to column oriented database schema transformation technique.
Masters thesis, Universiti Putra Malaysia.
Abstract
NoSQL databases were introduced to meet the growing demands placed on database
management systems, including the need to manage huge amounts of unstructured data.
Because of the limitations of relational databases, migrating data from relational
databases to NoSQL databases has become an important process in database management.
Schema transformation is an important step in data migration, and various techniques
have been proposed to improve schema transformation and data migration from
relational databases to NoSQL databases. The most common schema transformation
technique for NoSQL databases is denormalization. However, denormalization causes
unnecessary data duplication in the NoSQL database, which increases storage size.
Furthermore, NoSQL databases have their own limitations: they are limited in table
joining and cannot perform queries on multiple tables. Existing schema transformation
techniques that use nested table merging describe merging only two related tables.
Such inefficient schema transformation techniques force queries to run on multiple
tables and result in high query processing time.
This research proposed a schema transformation technique for migrating data from a
relational database to a column-oriented database. The technique has three main steps:
denormalization with read pattern, nested and multiple nested table merging, and
rowkey design. Together, these steps reduce data redundancy and storage size and
produce efficient query performance. In this technique, the read pattern identifies the
access key of the query. The nested and multiple nested table merging techniques
combine the tables that share the same access key into a nested form. On a
column-oriented database, this merging allows a query to retrieve its data from a
single table, which improves query performance. Meanwhile, the rowkey design
determines the rowkey based on the access keys identified by the read pattern
technique. The experimental results showed that, for the DELL DVD dataset, the
proposed technique reduced data redundancy by eight column families, thereby
reducing the storage size by 13.83% and improving the query performance time by
29.28%. For the Employees dataset, the proposed technique reduced data redundancy
by five column families, thereby reducing the storage size by 15.67% and improving
the query performance time by 29.13%.
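
The three steps described above can be illustrated with a minimal sketch. This is not
the thesis implementation: the toy tables, the read_patterns structure, and the helper
names group_tables_by_access_key and merge_nested are all hypothetical, chosen only to
show how tables that share an access key might be nested under a single rowkey.

```python
from collections import defaultdict

# Hypothetical toy relational tables (not taken from the thesis datasets).
tables = {
    "customers": [
        {"customer_id": 1, "name": "Ana"},
        {"customer_id": 2, "name": "Ben"},
    ],
    "orders": [
        {"order_id": 10, "customer_id": 1, "total": 25.0},
        {"order_id": 11, "customer_id": 1, "total": 40.0},
        {"order_id": 12, "customer_id": 2, "total": 15.0},
    ],
    "payments": [
        {"payment_id": 100, "customer_id": 1, "amount": 65.0},
    ],
}

# Step 1 (read pattern, assumed format): each expected query lists the tables
# it reads and the access key it uses to reach them.
read_patterns = [
    {"tables": ["customers", "orders"], "access_key": "customer_id"},
    {"tables": ["customers", "payments"], "access_key": "customer_id"},
]


def group_tables_by_access_key(patterns):
    """Collect, per access key, every table that some query reads through it."""
    groups = defaultdict(list)
    for pattern in patterns:
        for name in pattern["tables"]:
            if name not in groups[pattern["access_key"]]:
                groups[pattern["access_key"]].append(name)
    return dict(groups)


def merge_nested(tables, table_names, access_key):
    """Steps 2 and 3: nest rows that share an access-key value under one rowkey.

    Each source table becomes one column family, and the rowkey is the
    access-key value rendered as a string.
    """
    merged = {}
    for name in table_names:
        for row in tables[name]:
            rowkey = str(row[access_key])
            family = merged.setdefault(rowkey, {}).setdefault(name, [])
            # The access key is already carried by the rowkey, so it is not
            # repeated inside each nested cell.
            family.append({k: v for k, v in row.items() if k != access_key})
    return merged


groups = group_tables_by_access_key(read_patterns)
wide_table = merge_nested(tables, groups["customer_id"], "customer_id")
```

In this sketch, a query for everything related to one customer becomes a single lookup
such as wide_table["1"], instead of separate reads against three tables. Merging more
than two tables under one rowkey corresponds loosely to the multiple nested table
merging that the abstract describes.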