UPM Institutional Repository

A novel LightGBM model for Arabic spam detection integrated with XAI for enhanced explainability


Citation

Bouke, Mohamed Aly and Alramli, Omar Imhemed and Abdalhafid, Alsadg Ahmed Albadwi and Abdullah, Azizol (2026) A novel LightGBM model for Arabic spam detection integrated with XAI for enhanced explainability. Computers and Electrical Engineering, 133. art. no. 111032. pp. 1-24. ISSN 0045-7906

Abstract

Arabic spam detection presents technical challenges due to linguistic variability, feature sparsity, and the limited availability of interpretable classification systems. Several machine learning models lack intrinsic interpretability, which reduces transparency in decision-making. This study proposes a spam detection pipeline that combines LightGBM with an integrated explainability layer. Interpretability is incorporated directly into the evaluation process by embedding SHAP and LIME to quantify explanation behavior. The system targets characteristics of Arabic text data by operating in a compressed feature space. From an initial 100-dimensional TF-IDF representation, the top 20 statistically stable features are selected for the experimental design. Under this configuration, the model achieves 93.01 % accuracy, 94 % precision, 93 % recall, and an F1-score of 93 %, maintaining balanced precision and recall. Compared to baseline classifiers that exhibit asymmetric precision–recall behavior, the proposed model produces consistent classification performance on Arabic text. The explanation layer is evaluated using a coherence metric that measures rank agreement between intrinsic LightGBM feature importance and attribution scores. The resulting coherence values (SHAP = 0.726, LIME = 0.3806) indicate that SHAP explanations show higher rank agreement with the model’s internal feature ranking, while LIME captures more localized variation. The findings demonstrate that predictive performance is maintained under feature reduction and that explanation behavior can be evaluated quantitatively.
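Illustrative sketch (not from the paper): the abstract describes the coherence metric only as rank agreement between LightGBM's intrinsic feature importance and the SHAP or LIME attribution scores, without giving a formula. One plausible reading, assumed here, is a Spearman rank correlation between the importance vector and the absolute attribution magnitudes. The function names `average_ranks` and `rank_agreement` are hypothetical, and the authors' actual metric may differ.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank scores in descending order (rank 1 = largest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all entries tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def rank_agreement(importance: Sequence[float], attribution: Sequence[float]) -> float:
    """Spearman correlation between model importance and |attribution| rankings.

    ASSUMPTION: Spearman correlation stands in for the paper's coherence
    metric, whose exact definition is not given in the abstract.
    """
    if len(importance) != len(attribution):
        raise ValueError("both score vectors must cover the same features")
    rank_a = average_ranks(importance)
    # Attributions can be signed, so rank them by magnitude.
    rank_b = average_ranks([abs(a) for a in attribution])
    n = len(rank_a)
    mean_a = sum(rank_a) / n
    mean_b = sum(rank_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(rank_a, rank_b))
    var_a = sum((x - mean_a) ** 2 for x in rank_a)
    var_b = sum((y - mean_b) ** 2 for y in rank_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant ranking carries no ordering information
    return cov / (var_a * var_b) ** 0.5


# Toy values for three features (hypothetical, not the paper's data).
toy_importance = [5.0, 3.0, 1.0]
toy_attribution = [0.5, -0.3, 0.1]
print(rank_agreement(toy_importance, toy_attribution))
```

Under this reading, a value near 1 means the explainer orders features much as the model's own importance does, which matches how the abstract interprets the higher SHAP score relative to LIME.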



Additional Metadata

Item Type: Article
Subject: Control and Systems Engineering
Subject: Computer Science (all)
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.1016/j.compeleceng.2026.111032
Publisher: Elsevier
Keywords: Arabic spam detection; LightGBM; LIME; SHAP; Text classification; XAI
Depositing User: MS. HADIZAH NORDIN
Date Deposited: 13 Apr 2026 03:30
Last Modified: 13 Apr 2026 03:30
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1016/j.compeleceng.2026.111032
URI: http://psasir.upm.edu.my/id/eprint/123346