Citation
Bala, Yahaya Zakariyau (2024) Integrated approach for improving cross-project
software defect prediction performance. Doctoral thesis, Universiti Putra Malaysia.
Abstract
This research addresses three critical challenges in cross-project defect
prediction (CPDP): distribution differences, redundant features, and model
overfitting. These issues often degrade prediction accuracy and robustness in
various domains. To tackle these challenges, this study proposes a holistic
approach named Transformation, Feature Selection, and Multi-learning
(TFSM). The research has three objectives: firstly, to propose
transformation, feature selection and multi-learning techniques that can
mitigate distribution differences between datasets, identify and eliminate
redundant features and combat model overfitting, respectively. Secondly, to
integrate these techniques into the TFSM approach and implement it. Thirdly, to evaluate
each technique and the integrated approach. The research methodology
involves the formulation, implementation, and evaluation of each technique
individually and in their integrated form, TFSM. Experimental evaluations
are conducted on open-source software projects drawn from an open-source
repository, with F1-score serving as the primary evaluation metric.
Results from the experiments demonstrate significant improvements in
predictive performance. The transformation techniques effectively reduce
distribution differences, enhancing the model's ability to generalize across
diverse datasets. Feature selection methods successfully mitigate the negative
impact of redundant features, streamlining the learning process and improving
model interpretability. Additionally, the multi-learning approach proves
effective in reducing model overfitting by aggregating diverse model outputs.
When integrated into the TFSM approach, these techniques collectively
demonstrate a marked improvement in CPDP performance. The TFSM
approach leverages the strengths of each individual technique, resulting in a
synergistic effect that enhances the model’s predictive accuracy. This
approach addresses the multifaceted challenges inherent in CPDP, providing
a more reliable and effective solution for defect prediction in software projects.
This work contributes to the ongoing efforts in the software engineering
community to develop more accurate and reliable defect prediction models,
ultimately aiding in the development of higher-quality software. Future work will
focus on further refining these techniques and exploring their applicability to a
broader range of software projects and repositories.
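
As a minimal illustration of two ideas named in the abstract, the Python sketch below combines the binary defect predictions of several models by majority vote and scores the combined prediction with the F1 measure. Majority voting is only an assumed stand-in for the multi-learning technique, which the abstract does not specify; the function names and the toy labels are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for binary labels, where 1 = defective and 0 = clean."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # Precision and recall are both zero or undefined; report 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Label a module defective when more than half of the models say so."""
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


# Toy data (hypothetical): three models predicting on five target-project modules.
model_a = [1, 0, 1, 1, 0]
model_b = [1, 1, 0, 1, 0]
model_c = [0, 0, 1, 1, 1]
y_true = [1, 0, 0, 1, 1]

combined = majority_vote([model_a, model_b, model_c])
print("combined prediction:", combined)
print("F1 of combined prediction:", round(f1_score(y_true, combined), 3))
```

The aggregation step is kept separate from the metric so that a different combination rule, such as averaging predicted probabilities, could be swapped in without touching the evaluation code.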