A framework for extracting, classifying, analyzing, and presenting information from semi-structured web data sources

Shaker, Mahmoud and Ibrahim, Hamidah and Mustapha, Aida and Abdullah, Lili Nurliyana (2010) A framework for extracting, classifying, analyzing, and presenting information from semi-structured web data sources. Journal of Next Generation Information Technology, 1 (3). pp. 106-114. ISSN 2092-8637

Full text not available from this repository.

Abstract

Extracting information from web data sources has become very important because of the massive and increasing amount of diverse semi-structured information sources available to users on the Internet, and the variety of web pages makes information extraction from the web a challenging problem. This paper proposes a framework for extracting, classifying, analyzing, and presenting information from semi-structured web data sources. The framework is able to extract relevant information from different web data sources and classify the extracted information based on the standard classification scheme of Nokia products, which has been chosen as the case study.

Item Type: Article
Keyword: Information Extraction, Semi-Structured, Web Data Sources
Subject: Information storage and retrieval systems.
Subject: Natural language processing (Computer science)
Subject: Interactive computer systems.
Faculty or Institute: Faculty of Computer Science and Information Technology
ID Code: 12693
Deposited By: Umikalthom Abdullah
Deposited On: 24 Nov 2011 04:50
Last Modified: 24 Nov 2011 04:50
