Adaptive model for semantic question answering disambiguation over linked data

Citation

Sofian, Hazrina (2018) Adaptive model for semantic question answering disambiguation over linked data. Doctoral thesis, Universiti Putra Malaysia.

Abstract

Semantic Question Answering (SQA) accepts natural language (NL) questions from users and presents the exact answers retrieved from linked data. It requires three kinds of disambiguation: NL question disambiguation, linked data environment disambiguation and multi-type word disambiguation. Firstly, NL disambiguation addresses three meta-mapping aspects: the variation of question patterns, question complexity and the linguistic terminology of the NL questions posed by users. Secondly, linked data disambiguation addresses another four meta-mapping aspects: the variation of datatypes, resource heterogeneity, knowledge base (KB) concept terminology and the variation of structure in the linked data. Thirdly, word disambiguation resolves the linguistic terminology against the KB concept terminology. These three disambiguations need to be addressed simultaneously because an empirical study carried out in this research found that the components of a SPARQL Protocol and RDF Query Language (SPARQL) query are determined by these seven meta-mapping aspects. Most existing studies modify the questions manually, select only certain patterns of NL questions, or select only simple questions from the dataset. Moreover, certain processes are semi-automated, as some SQA systems rely heavily on pre-determined lexicon knowledge for word disambiguation or on manually annotated mappings for SPARQL query construction. However, a manual or semi-automated process cannot cater for new question patterns posed by users or adapt to the contents of linked data, which are ever-changing and incrementally growing.
These limitations motivate this research to first design the Adaptive-based Natural Language Disambiguation (ANLD) model, which integrates the Linguistic-based SPARQL Translation Model (LBSTM), a selective part-of-speech (POS) tag extraction technique, a syntactic representation composition technique and a model matching technique to disambiguate NL questions. Next, this research designs the Adaptive-based Linked Data Structure Disambiguation (ALID) model, which is executed if the output of the ANLD model cannot retrieve an answer from the linked data. ALID uses a component-based approach and a feedback loop approach to disambiguate the linked data environment and to resolve word ambiguity. Precision, recall and F-measure are used as performance metrics to evaluate the accuracy of the SPARQL queries, which are the outputs of this research. Accuracy is evaluated by comparing the constructed SPARQL queries with the gold standard results provided by the dataset. The results illustrate that the adaptive models are able to perform the three SQA disambiguations simultaneously without manual modification. These achievements enable autonomous translation of NL questions into SPARQL queries, for users with unpredictable question-writing styles and against linked data that is incrementally growing in size and complexity.
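The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that each constructed SPARQL query is scored by comparing the set of answers it returns with the gold standard answer set for the same question; the abstract does not spell out the exact scoring procedure, so the function name `precision_recall_f_measure` and the sample resource names are hypothetical.

```python
def precision_recall_f_measure(system_answers, gold_answers):
    """Score one question by comparing two answer sets.

    Assumption: answers are hashable identifiers (e.g. resource URIs or
    literals) and duplicates are ignored, so both inputs are treated as sets.
    """
    system = set(system_answers)
    gold = set(gold_answers)
    true_positives = len(system & gold)

    # Precision: share of returned answers that are correct.
    precision = true_positives / len(system) if system else 0.0
    # Recall: share of gold standard answers that were returned.
    recall = true_positives / len(gold) if gold else 0.0

    # F-measure: harmonic mean of precision and recall (0 if both are 0).
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical answer sets for a single question.
returned = ["res:A", "res:B", "res:C"]
expected = ["res:A", "res:B", "res:D", "res:E"]
p, r, f = precision_recall_f_measure(returned, expected)
print(f"precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

Scores computed per question in this way are commonly averaged over the whole dataset; whether the thesis averages per question or pools all answers is not stated in the abstract.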

Additional Metadata

Item Type:	Thesis (Doctoral)
Subject:	Semantic Web
Subject:	Speech processing systems
Subject:	Semantic networks (Information theory)
Call Number:	FSKTM 2018 66
Chairman Supervisor:	Associate Professor Nurfadhlina Binti Mohd Sharef, PhD
Divisions:	Faculty of Computer Science and Information Technology
Depositing User:	Ms. Nur Faseha Mohd Kadim
Date Deposited:	11 Feb 2020 01:52
Last Modified:	11 Feb 2020 01:52
URI:	http://psasir.upm.edu.my/id/eprint/76956