UPM Institutional Repository

Optical character recognition for Quranic image similarity matching


Alotaibi, Faiz and Abdullah, Muhamad Taufik and Abdullah, Rusli and O. K. Rahmat, Rahmita Wirza and Hashem, Ibrahim Abaker Targio and Sangaiah, Arun Kumar (2017) Optical character recognition for Quranic image similarity matching. IEEE Access, 6. 554 - 562. ISSN 2169-3536


Optical character recognition (OCR) is the detection and recognition of characters in an image and their conversion into text. Arabic OCR is a distinct type of OCR used to process Arabic characters. OCR is increasingly used in applications where a process should run automatically, without human involvement. Quranic text contains two elements, namely, diacritics and characters, and processing these elements can cause an OCR system to malfunction and reduce its accuracy. In this paper, a new method is proposed to check the similarity and originality of Quranic content. The method combines Quranic diacritic and character recognition techniques. Diacritic detection is performed using a region-based algorithm, and character recognition is performed using the projection method; in each case, an optimization technique is applied to increase the recognition ratio. The result of the proposed method is compared with the standard Mushaf al Madinah benchmark to find similarities that match the text of the Holy Quran. The obtained accuracy was superior to that of the tested K-nearest neighbor (knn) algorithm and to published results in the literature. The improved knn algorithm reached accuracies of 96.4286% for diacritics and 92.3077% for characters, both better than the knn algorithm.
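The abstract names a projection method for character recognition but gives no implementation details. As a rough illustration only, the following Python sketch shows the generic projection-profile idea: sum ink pixels per row to find text lines, then per column within each line to find separated pieces. The function names, the zero threshold, and the toy image are assumptions made for this example and are not taken from the paper; because Arabic script is cursive, the column runs found this way correspond to disconnected pieces rather than to individual letters.

```python
import numpy as np


def projection_profile(binary, axis):
    """Count ink pixels along an axis (axis=1: one count per row, axis=0: one per column)."""
    return binary.sum(axis=axis)


def runs_of_ink(profile, threshold=0):
    """Return (start, end) index pairs, end exclusive, where profile > threshold."""
    mask = profile > threshold
    # Pad with False on both sides so every run has a rising and a falling edge.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))


def segment_lines_and_columns(binary):
    """Split an image into text lines (row projection), then each line into column runs."""
    result = []
    for top, bottom in runs_of_ink(projection_profile(binary, axis=1)):
        line = binary[top:bottom, :]
        columns = runs_of_ink(projection_profile(line, axis=0))
        result.append(((top, bottom), columns))
    return result


# Hypothetical toy image: 1 = ink, 0 = background.
page = np.zeros((7, 10), dtype=np.uint8)
page[1:3, 1:3] = 1   # first piece on the first line
page[1:3, 5:8] = 1   # second piece on the first line
page[4:6, 2:6] = 1   # single piece on the second line

segments = segment_lines_and_columns(page)
print(segments)
```

On real scans the threshold would normally be raised above zero to tolerate noise, and diacritics above or below the baseline would need separate handling, which this sketch does not attempt.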

Full text: Optical character recognition for Quranic image similarity matching.pdf (4MB; restricted to repository staff only)
Official URL or Download Paper: https://ieeexplore.ieee.org/document/8101474

Additional Metadata

Item Type: Article
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.1109/ACCESS.2017.2771621
Publisher: Institute of Electrical and Electronics Engineers
Keywords: Image processing; Character recognition; Quranic diacritics; Knn; Optimization
Depositing User: Mr. Sazali Mohamad
Date Deposited: 26 Nov 2019 07:56
Last Modified: 26 Nov 2019 07:56
URI: http://psasir.upm.edu.my/id/eprint/75148