Citation
Alidin, Az Azrinudin
(2007)
Incorporation of Contextual Retrieval and Data Fusion Approach Towards Improving The Retrieval Precision.
Masters thesis, Universiti Putra Malaysia.
Abstract
Generally, the functionality of information retrieval (IR) can be divided into two
categories: one deals with search and retrieval, while the other
is concerned with subject or content analysis. In the search and retrieval
part, an IR system presents a ranked list of relevant documents based on the
user-submitted query, which represents the user's information need. The ranked
list indicates the probability that each document is relevant to the query by
placing the most relevant document at the top position, and so forth. However,
queries are often formulated with short, simplified words, such as "Java". Such
words cannot precisely summarise the user's information need and its context,
e.g. "java, programming language" or "java, the island". Consequently, the user's
information need is not satisfied, because the most relevant document is not positioned
accordingly or too many relevant documents are presented in the ranked list.
Besides, a simplified query makes the context hard to extract, and
in recent years there has been much research interest in contextual retrieval. Like IR,
contextual retrieval retrieves relevant documents, but it does so by combining the
query, user context and search technology into a single framework. Furthermore, in
contextual retrieval, the user's context is exploited to identify the relevant
documents that are useful at the time the request occurs.
On the other hand, in order to match queries with document representations,
different IR schemes are applied to calculate the probability of relevance. As a result,
retrieval precision often differs between IR schemes, which present dissimilar lists of
relevant documents for the same submitted query. A data fusion approach, in which
multiple sources of results are combined, is therefore used in IR to overcome this
complication. Implementing data fusion in IR
involves merging the retrieval results from different IR schemes into a single
unified ranked list that is expected to present the relevant documents with higher
precision.
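The merging step can be illustrated with a minimal sketch. The abstract does not say which fusion rule the thesis uses, so the example below assumes CombSUM with min-max score normalisation, a commonly used rule; the function names, scheme names and scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalise(scores):
    """Rescale one scheme's scores to [0, 1] so different schemes are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs):
    """Fuse several {doc_id: score} result sets by summing normalised scores.

    Assumption: CombSUM is used here for illustration only; it is not
    necessarily the fusion rule applied in the thesis.
    """
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalise(run).items():
            fused[doc] += score
    # Highest fused score first; ties are broken by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


# Hypothetical scores from two IR schemes for the same query.
scheme_a = {"d1": 0.9, "d2": 0.4, "d3": 0.1}
scheme_b = {"d2": 12.0, "d3": 9.0, "d4": 2.0}
fused_ranking = comb_sum([scheme_a, scheme_b])
```

A document that both schemes score well (here `d2`) rises above a document that only one scheme retrieves, which is the intuition behind expecting higher precision from the unified list.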
This study presents an approach that incorporates contextual retrieval and data fusion,
using a one-keyword query, towards improving retrieval precision. The methods to
identify user context are categorised into four approaches: relevance feedback, user
profiles, word-sense disambiguation and knowledge engineering. In order to extract
user context and to model contextual retrieval, two term-weighting schemes are
implemented in this study: the Watson scheme, based on the user profiles and knowledge
engineering approaches, and the WordSieve scheme, based on the word-sense
disambiguation approach. Five randomly selected documents are submitted to these schemes, and the
user context extracted is used to expand the initial query for the retrieval process.
In addition, the feasibility of adopting a data fusion approach was assessed in this
study by testing two preconditions, the efficacy and dissimilarity tests for the IR
scheme candidates, as there is a possibility that the precision improvement may not
be achieved. Two queries, Java and Jaguar, expanded using the user
context extracted by Watson and WordSieve, are submitted, and more than ten
thousand documents are collected as the data collection for the
experiment. The performance of the experiment is evaluated using three
assessments: the precision-recall graph, precision evaluation based on document rank
and mean average precision. The data fusion experiment based on the contextual
retrieval results reveals a significant improvement in retrieval precision: based on
the mean average precision calculation, the lowest gain is approximately thirty-seven
percent compared to the basic IR scheme, ten percent compared to Watson and fifteen
percent compared to WordSieve.
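For readers unfamiliar with the third assessment, the sketch below shows the standard definition of mean average precision (MAP): average precision is computed per query over the ranks at which relevant documents appear, then averaged across queries. The function names and the toy rankings are hypothetical and are not data from the thesis.

```python
def average_precision(ranked, relevant):
    """Average of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical rankings and relevance judgements for two queries.
query_1 = (["d1", "d2", "d3", "d4"], {"d1", "d3"})
query_2 = (["d5", "d6"], {"d6"})
map_score = mean_average_precision([query_1, query_2])
```

In this toy case the first query has an average precision of (1/1 + 2/3) / 2 and the second of 1/2, so `map_score` is their mean, roughly 0.67.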