UPM Institutional Repository

Hybrid Reinforcement Learning-Active Learning framework for real-data augmentation in imbalanced credit scoring


Citation

Nazri, Azree and Agbolade, Olalekan and Aziz, Faisal (2025) Hybrid Reinforcement Learning-Active Learning framework for real-data augmentation in imbalanced credit scoring. IEEE Access, 13. pp. 191005-191023. ISSN 2169-3536

Abstract

Class imbalance is a critical challenge in credit scoring, where the dominance of majority class samples reduces predictive performance for minority instances. Traditional methods, such as SMOTE and random undersampling, attempt to rebalance datasets but often introduce synthetic noise or discard valuable data. This paper introduces Augmentation Based on Uncertainty and Difficulty (UDDA), a novel real-data augmentation framework that avoids both synthetic data generation and majority class removal. UDDA leverages a hybrid of Reinforcement Learning (RL) and Active Learning (AL) to identify and prioritize real, informative, and difficult-to-classify samples. This approach enhances model robustness while preserving dataset integrity. Experimental evaluation on twenty benchmark imbalanced tabular datasets demonstrates that UDDA outperforms established methods, including SMOTE and undersampling, across key metrics such as precision, recall, F1-score, and accuracy. UDDA sets a new direction in imbalanced learning by offering a practical, interpretable, and effective solution for improving classification performance in credit scoring applications.
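
The abstract does not say how UDDA scores or selects samples, so the following is not the authors' method. It is only a minimal sketch of entropy-based uncertainty sampling, a standard active-learning heuristic, included to illustrate what prioritizing real, difficult-to-classify samples can look like in practice. The function names `binary_entropy` and `select_uncertain` and the example probabilities are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a predicted positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_uncertain(probabilities, k):
    """Return the indices of the k most uncertain samples, most uncertain first.

    `probabilities` holds a classifier's predicted positive-class probability
    for each real (non-synthetic) sample in the pool.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: binary_entropy(probabilities[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical predicted probabilities for five real samples.
probs = [0.02, 0.48, 0.91, 0.55, 0.30]
picked = select_uncertain(probs, 2)  # indices of the two samples closest to 0.5
```

In this sketch the selected indices point at existing records only, so no synthetic samples are generated and no majority-class records are discarded. How the paper combines such a signal with reinforcement learning is not stated in the abstract.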


Download File

124853.pdf - Published Version
Available under License Creative Commons Attribution.

Official URL or Download Paper: https://ieeexplore.ieee.org/document/11153928/

Additional Metadata

Item Type: Article
Subject: Computer Science (all)
Subject: Materials Science (all)
Subject: Engineering (all)
Divisions: Faculty of Computer Science and Information Technology; Institute for Mathematical Research
DOI Number: https://doi.org/10.1109/access.2025.3608032
Publisher: Institute of Electrical and Electronics Engineers
Keywords: Active learning; Class imbalance; Real data; Reinforcement learning; Synthetic data
Sustainable Development Goals (SDGs): SDG 8: Decent Work and Economic Growth, SDG 9: Industry, Innovation and Infrastructure, SDG 10: Reduced Inequalities
Depositing User: Ms. Nur Faseha Mohd Kadim
Date Deposited: 24 Apr 2026 02:43
Last Modified: 24 Apr 2026 02:43
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1109/access.2025.3608032
URI: http://psasir.upm.edu.my/id/eprint/124853