Citation
Abdullah, Muhamad Taufik and Ahmad, Fatimah and Mahmod, Ramlan and Tengku Sembok, Tengku Mohd
(2005)
Improvement of Malay information retrieval using local stop words.
In: International Advanced Technology Congress: Conference on Computer Integrated Systems, 6-8 Dec. 2005, Putrajaya, Malaysia.
Abstract
This paper reports an experiment on a Malay information retrieval system using local stop word lists. We extract candidate stop words from a Malay Quranic text collection, with a list of existing stop words as the preliminary list. All words from the Quranic documents are extracted and ranked by frequency of occurrence in decreasing order, and the 50 most frequently occurring words in the corpus are then obtained. The new Malay stop word list is evaluated on the Quranic collection. Applying the new stop words, in which the words from the Quran are compared with the stored stop word lists, leaves 39.7% to 43.6% of the terms remaining. An experiment was conducted to evaluate the performance of these stop word lists in terms of recall and precision for a Malay information retrieval system. Comparing the results obtained without and with the stop word lists shows that the new stop word lists increased average precision by 24.8% to 35.0%. The results demonstrate that this list can be used successfully in a Malay information retrieval system.
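The frequency-ranking step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tokenizer, the function names (`tokenize`, `top_frequent_words`, `remaining_term_ratio`) and the toy sentences are all assumptions made for illustration, and the paper's actual preprocessing of the Quranic collection is not specified in this record.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase the text and keep runs of letters only.
    return re.findall(r"[^\W\d_]+", text.lower())


def top_frequent_words(documents, n=50):
    # Count every token in the collection and return the n most frequent,
    # in decreasing order of frequency.
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(n)]


def remaining_term_ratio(documents, stop_words):
    # Fraction of tokens left after dropping every token in stop_words.
    stop = set(stop_words)
    tokens = [t for doc in documents for t in tokenize(doc)]
    if not tokens:
        return 0.0
    kept = [t for t in tokens if t not in stop]
    return len(kept) / len(tokens)


# Toy, hypothetical documents (not taken from the paper's corpus).
sample_docs = ["dan yang itu dan ini", "dan kitab yang benar"]
candidates = top_frequent_words(sample_docs, n=2)
ratio = remaining_term_ratio(sample_docs, candidates)
```

In a setting like the one described, the candidate list would presumably be combined with the existing preliminary stop word list before filtering; how the authors merged the two is not stated here, so the sketch leaves that step out.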