UPM Institutional Repository

Quranic ontology for resolving query translation disambiguation in English-Malay cross-language information retrieval


Citation

Yahya, Zulaini (2012) Quranic ontology for resolving query translation disambiguation in English-Malay cross-language information retrieval. Masters thesis, Universiti Putra Malaysia.

Abstract

This research proposes a cross-language information retrieval (CLIR) method that uses a domain-specific ontology and its concepts to disambiguate query translation. The experiments use the Quran, in English and Malay, as a bilingual parallel corpus, and Quranic concepts as a resource for cross-language query translation alongside dictionary-based translation. The study evaluates the effectiveness of dictionary-based and ontology-based query translation for a CLIR system. Two basic translation approaches serve as benchmarks: 1) the first translation listed in the dictionary; and 2) all translation candidates listed in the dictionary. The proposed CLIR method uses three approaches: 1) verse list; 2) concept similarity; and 3) concept expansion. Concepts are matched before and after query translation using two approaches: 1) query concepts; and 2) translation concepts. The experimental results show that dictionary-based retrieval performs worse than monolingual retrieval on both the English and Malay document collections. Direct translation returns many candidate results, which lowers retrieval performance on both collections. With the proposed method, the verse list, concept similarity, and concept expansion approaches all outperform dictionary-based translation on the English document collections, whether query concepts or translation concepts are matched, but not on the Malay document collections, where retrieval performance improves only with the concept expansion approach. The English language is better structured than Malay, which affects retrieval performance. A single Malay word may have several meanings, determined not only by the word itself but also by the meaning of the verse or chapter in which it appears. This is one reason why retrieval performance decreases on the Malay document collections.
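
The two dictionary baselines named in the abstract, and the general idea of using a parallel corpus to narrow translation candidates, can be illustrated with a minimal sketch. It is not the thesis implementation: the dictionary entries, the aligned sentence pairs (invented toy text, not actual Quran translations), and the function names are all assumptions, and the third function is only one plausible reading of a verse-based filter.

```python
# Toy English-to-Malay dictionary (illustrative entries, not from the thesis).
DICTIONARY = {
    "fast": ["cepat", "puasa"],     # "quick" vs. "fasting"
    "light": ["ringan", "cahaya"],  # "not heavy" vs. "illumination"
    "month": ["bulan"],
}

# Toy aligned (English, Malay) sentence pairs standing in for a parallel corpus.
# These are invented examples, not verses.
PARALLEL_PAIRS = [
    ("fast during the month", "puasa pada bulan itu"),
    ("a light for the people", "cahaya bagi manusia"),
]


def translate_first(query):
    """Baseline 1: keep only the first dictionary translation of each term."""
    result = []
    for term in query.lower().split():
        # Terms missing from the dictionary pass through untranslated.
        candidates = DICTIONARY.get(term, [term])
        result.append(candidates[0])
    return result


def translate_all(query):
    """Baseline 2: keep every dictionary translation of each term."""
    result = []
    for term in query.lower().split():
        result.extend(DICTIONARY.get(term, [term]))
    return result


def translate_with_parallel_pairs(query):
    """Assumed corpus-based filter (hypothetical, not the author's algorithm).

    For each term, keep only candidates that appear on the Malay side of a
    pair whose English side contains the term. If no candidate is supported,
    fall back to all candidates.
    """
    result = []
    for term in query.lower().split():
        candidates = DICTIONARY.get(term, [term])
        malay_tokens = set()
        for english, malay in PARALLEL_PAIRS:
            if term in english.split():
                malay_tokens.update(malay.split())
        supported = [c for c in candidates if c in malay_tokens]
        result.extend(supported or candidates)
    return result


if __name__ == "__main__":
    print(translate_first("fast month"))
    print(translate_all("fast month"))
    print(translate_with_parallel_pairs("fast month"))
```

In this toy setting the first-translation baseline picks "cepat" (quick) for "fast", the all-translations baseline keeps both senses, and the corpus-based filter keeps only "puasa" (fasting) because that is the candidate seen in the aligned text.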


Download File

FSKTM 2012 27R.pdf (PDF, 855kB)

Additional Metadata

Item Type: Thesis (Masters)
Subject: Cross-language information retrieval
Subject: Qurʼan - Translating
Subject: English language - Translating
Call Number: FSKTM 2012 27
Chairman Supervisor: Muhamad Taufik bin Abdullah, PhD
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Haridan Mohd Jais
Date Deposited: 04 Feb 2015 07:31
Last Modified: 04 Feb 2015 07:31
URI: http://psasir.upm.edu.my/id/eprint/31652