UPM Institutional Repository

Transformation of Extracted Knowledge in Malay Unstructured Documents Into an Interrogative Structured Form


Citation

Sidi, Fatimah (2007) Transformation of Extracted Knowledge in Malay Unstructured Documents Into an Interrogative Structured Form. Doctoral thesis, Universiti Putra Malaysia.

Abstract

Knowledge discovery operations help to extract valuable information and knowledge from large volumes of data in structured databases. However, a large portion of the available information is not in structured form but is held in collections of unstructured text documents, and this also applies to Malay unstructured documents. Therefore, structuring characteristics must be imposed on unstructured documents in order to transform the information they contain into knowledge. A new approach has been established to transform knowledge extracted from Malay unstructured documents by identifying, organizing, and structuring it into an interrogative structured form. Its architecture is based on the implementation of (i) interrogative knowledge identification; (ii) interrogative contextual information; and (iii) interrogative knowledge organization and structuring with Malay knowledge representation by concepts. It utilizes the Malay language corpus, interrogative theory, and object-oriented, ontology, and database models. The research involves system development based on the architecture of the MalayIK-Ontology, which is measured by quantitative retrieval performance using the recall and precision metrics. The Retrieval Interrogative Ontology Analysis Application was developed to verify fitness for task in terms of the functionality and usefulness of interrogative contextual information with color coding, additional information annotation, and Malay knowledge representation by concepts. A number of experiments were carried out to quantify the accuracy of the knowledge extracted. The MalayIK-Ontology was tested using stratified random sampling drawn from various sources of Malay unstructured documents such as news, e-mails, articles, magazines, and texts from children's story books.
The results of the experiments show that the MalayIK-Ontology approach performed well compared with knowledge extracted manually by an expert. The results of the questionnaire evaluation of the Retrieval Interrogative Ontology Analysis Application show good achievement in helping users understand the main point of an unstructured document easily and clearly. The approach is intended to improve understanding of the process of making sense of information as knowledge, to maintain the meaning of the information, and to support a consistent interpretation of the knowledge in an unstructured document, so that different people perceive the same knowledge.
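
The recall and precision metrics named in the abstract can be illustrated with a minimal sketch. It assumes that extracted knowledge is represented as (interrogative word, answer) pairs and that a system's output is compared against an expert's manual extraction by exact match; the function name, the pair representation, and the sample data are hypothetical and are not taken from the thesis.

```python
def precision_recall(extracted, reference):
    """Compute precision and recall of extracted items against a reference set.

    Both arguments are iterables of hashable items. Matching is by exact
    equality, which is an assumption made for this illustration only.
    """
    extracted_set = set(extracted)
    reference_set = set(reference)
    true_positives = len(extracted_set & reference_set)
    # Precision: share of the system's items that the expert also extracted.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    # Recall: share of the expert's items that the system found.
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


# Hypothetical (interrogative word, answer) pairs for one document.
system_items = [
    ("siapa", "Ali"),
    ("di mana", "Kuala Lumpur"),
    ("bila", "semalam"),
    ("apa", "buku"),
]
expert_items = [
    ("siapa", "Ali"),
    ("di mana", "Kuala Lumpur"),
    ("bila", "pagi semalam"),
    ("apa", "buku"),
    ("mengapa", "hujan"),
]

precision, recall = precision_recall(system_items, expert_items)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up sample, three of the four system items match the expert's five items, so precision is 3/4 and recall is 3/5. A real evaluation would need a matching rule for partially correct answers, which this sketch does not attempt.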


Additional Metadata

Item Type: Thesis (Doctoral)
Subject: Knowledge acquisition (Expert systems)
Subject: Databases
Call Number: FSKTM 2007 10
Chairman Supervisor: Associate Professor Hj. Mohd Hasan Selamat
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Nur Izyan Mohd Zaki
Date Deposited: 05 May 2010 07:52
Last Modified: 20 Jan 2022 07:29
URI: http://psasir.upm.edu.my/id/eprint/5887