Document enrichment using semantic tags for effective XML retrieval

Abubakar, Roko and C. Doraisamy, Shyamala and Azman, Azreen and Jantan, Azrul Hazri (2013) Document enrichment using semantic tags for effective XML retrieval. International Journal of Advancements in Computing Technology, 5 (13). pp. 138-146. ISSN 2005-8039

Marking up document contents with user-defined, self-descriptive terms has made XML one of the most widely used technologies for information representation and exchange over the Internet. As a result, many documents are now represented and stored as XML documents on the web, and there is a need for precise, efficient and user-friendly search techniques. Existing systems that support Content Only (CO) queries fall into three categories: Lowest Common Ancestor (LCA)-based systems, query structuring systems and document structure-based systems. The answers returned by the first group are either irrelevant to the user’s search intention or not meaningful or informative enough, because of the restriction on the choice of the root node. The second group mostly requires a data schema for its query conversion, which is not always available or may be complex and fast-evolving. Most existing systems place their emphasis on the query side. In this paper, we focus on the document side instead. Our approach exploits document structure: we enrich the text of Wikipedia XML documents with the annotated semantic tags present in the documents. The effect of enriching elements’ text content is investigated through three retrieval experiments in which only the text content of the document collection differs. The results of the experiments reveal that enriching elements’ text content with the semantic tags could improve the effectiveness of CO queries.
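
The abstract does not spell out how the enrichment is performed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a "semantic tag" is simply the name of an annotated XML element and that enrichment means appending that name to the element's own text before indexing. The function name `enrich_with_tags` and the sample element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def enrich_with_tags(xml_text: str) -> str:
    """Append each element's tag name to its own text content.

    Assumption (not from the paper): the tag name itself is the
    semantic term to add, with underscores read as spaces.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text:
            # Only elements that carry text of their own are enriched.
            tag_terms = elem.tag.replace("_", " ")
            elem.text = f"{text} {tag_terms}"
    return ET.tostring(root, encoding="unicode")


# Hypothetical annotated fragment, for illustration only.
sample = "<article><physicist>Albert Einstein</physicist><city>Ulm</city></article>"
enriched = enrich_with_tags(sample)
print(enriched)
```

Under these assumptions, a CO query containing the term "physicist" could match the first element even though the word never appears in the original text, which is the kind of effect the retrieval experiments are described as measuring.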

Item Type: Article
Keyword: Content-Only Query (CO); Content and Structure Only Query (CAS); XML retrieval; Annotated semantic tag
Faculty or Institute: Faculty of Computer Science and Information Technology
Publisher: Advanced Institute of Convergence Information Technology (AICIT)
ID Code: 30605
Deposited By: Nida Hidayati Ghazali
Deposited On: 06 Feb 2015 16:02
Last Modified: 09 Sep 2015 14:07