Citation
Yauri, Aliyu Rufai (2014) Automated semantic query formulation for Quranic verse translation retrieval. PhD thesis, Universiti Putra Malaysia.
Abstract
With the exponential growth in the amount of data deposited daily on the web and in other storage repositories, there is a growing global need to retrieve that data more effectively and efficiently. Data is retrieved through a number of mechanisms, such as the search engines Yahoo and Google; however, most current information retrieval mechanisms on the web are based on keyword search. Keyword search frequently retrieves information that is not relevant to the query because of problems such as the semantic ambiguity of natural language, so the user needs to know the exact keyword to use in order to retrieve relevant information. To overcome this problem, several approaches have been researched, such as query formulation, and most are based on keywords and small query fragments.
This thesis proposes a study of automatic semantic query formulation from natural language queries to structured queries. The proposed system is referred to as AutoSQuR (Automated Semantic Quran Retrieval). AutoSQuR attempts to semantically formulate complex natural language queries into a triple representation and to retrieve relevant verses from the Holy Quran.
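As a rough illustration of what a triple representation of a query might look like, the sketch below models a subject-predicate-object triple and renders it as a simple pattern string. The class name `Triple`, the function `to_pattern`, the `ex:` prefix, and the example query are all hypothetical and are not taken from the thesis; the actual AutoSQuR representation may differ.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object triple (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str


def to_pattern(t: Triple) -> str:
    """Render a triple as a single space-separated pattern ending in ' .'."""
    return f"{t.subject} {t.predicate} {t.obj} ."


# Hypothetical natural language query and a hand-written triple for it.
# The mapping from text to triple is the hard part and is NOT shown here.
query = "Which verses mention patience?"
triple = Triple(subject="?verse", predicate="ex:mentions", obj="ex:Patience")
pattern = to_pattern(triple)
print(pattern)
```

Here the variable `?verse` stands for the unknown to be retrieved, while the predicate and object name the relation and concept identified in the query.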
The main contribution of this research is a method for automatically formulating natural language queries as structured semantic queries using a statistical machine learning technique. The contribution includes going beyond keywords and small fragment queries to formulate complex queries that can be a paragraph in length. Additionally, the proposed system supports both categories of users: those who prefer suggestions from the system, and those who prefer to reformulate their query when the system fails to formulate it automatically. The proposed system provides suggestions to the user whether or not concepts are identified in the query. Another contribution is the use of ontology equivalent assertions because of the limitations of WordNet for the disambiguation of Islamic-related words.
Finally, an experimental evaluation of AutoSQuR was carried out. The evaluation compared the performance of the proposed statistical machine learning technique with the existing approach in FREyA in terms of the percentage of queries that are semantically formulated correctly and the effectiveness of the retrieved Quran verses. The evaluation showed that the proposed approach outperformed the existing approach in FREyA. The statistical machine learning technique showed an improvement of 17.4% over the existing approach in FREyA in terms of the correctness of query formulation. In terms of the effectiveness of the retrieved verses, the proposed approach showed an improvement of 0.06 in precision and 0.1 in recall.
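For reference, precision and recall in retrieval evaluation are conventionally computed from the sets of retrieved and relevant items. The sketch below uses the standard set-based definitions; the function name `precision_recall` and the verse identifiers are placeholders for illustration only and are not data from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query using set-based definitions.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    An empty denominator yields 0.0.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Placeholder identifiers, not real evaluation data.
retrieved = ["verse_a", "verse_b", "verse_c", "verse_d"]
relevant = ["verse_a", "verse_c", "verse_d", "verse_e", "verse_f"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of the four retrieved items are relevant, and three of the five relevant items are retrieved.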