Citation
Nasharuddin, Nurul Amelina and Azman, Azreen and Abdullah, Muhamad Taufik and Abdul Kadir, Rabiah
(2019)
A framework for English and Malay cross-lingual document alignment method.
International Journal of Advanced Trends in Computer Science and Engineering, 8 (1.3).
pp. 190-195.
ISSN 2278-3091
Abstract
The information divide in multilingual information retrieval
is usually addressed by translating users’ queries into a
language that the users understand. However, dictionaries and
other translation resources are scarce for some Asian
languages. The objective of this study was to automatically
align English and Malay news documents into a comparable
corpus, which could serve as a translation resource to
improve query translation in cross-lingual information
retrieval. This study proposes a direct alignment framework
that utilizes the similarity of each document’s textual
features, together with a novel approach that uses the
similarity of the documents’ sentiment to improve the
effectiveness of the alignment method. The proposed
sentiment-based approach outperformed existing alignment
methods and improved the effectiveness in differentiating
related and unrelated documents. These aligned comparable
documents can further be utilised in translation research for
English and Malay cross-lingual information retrieval
tasks.
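
The abstract does not give the scoring formula, so the following is only a
minimal sketch of how a textual-feature similarity and a sentiment similarity
might be combined to align documents. Everything in it is an assumption made
for illustration: the `Doc` type, the idea that each document has already been
reduced to a set of language-independent tokens (such as names and numbers)
and a polarity score in [-1, 1], the Jaccard measure, the 0.7 weight, and the
0.5 threshold. None of these are taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    features: frozenset  # hypothetical language-independent tokens (names, numbers)
    sentiment: float     # hypothetical polarity score in [-1.0, 1.0]


def feature_similarity(a, b):
    """Jaccard overlap of the two feature sets (0.0 when both are empty)."""
    union = a.features | b.features
    if not union:
        return 0.0
    return len(a.features & b.features) / len(union)


def sentiment_similarity(a, b):
    """1.0 for identical polarity, 0.0 for opposite extremes."""
    return 1.0 - abs(a.sentiment - b.sentiment) / 2.0


def alignment_score(a, b, weight=0.7):
    """Weighted mix of both similarities; `weight` is illustrative only."""
    return (weight * feature_similarity(a, b)
            + (1.0 - weight) * sentiment_similarity(a, b))


def align(english_docs, malay_docs, threshold=0.5):
    """Pair each English document with its best Malay match above `threshold`."""
    pairs = []
    for en in english_docs:
        best = max(malay_docs, key=lambda ms: alignment_score(en, ms), default=None)
        if best is not None and alignment_score(en, best) >= threshold:
            pairs.append((en.doc_id, best.doc_id))
    return pairs
```

In this sketch the sentiment term acts as a secondary signal: two documents
that share a few tokens but differ strongly in polarity score lower than two
that agree on both, which is one way the differentiation of related and
unrelated documents described above could be approximated.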