Abstract
Missing values are common in healthcare datasets, and their presence usually degrades the results of data analysis performed with machine learning methods. Researchers have recently proposed several imputation approaches for handling missing values in real-world datasets. Data imputation also helps in building high-performance machine learning models that discover patterns in healthcare data and support higher-quality decision-making. In this paper, we propose a new imputation approach, named ExtraImpute, that uses Extremely Randomized Trees (Extra Trees), an ensemble learning method, to handle missing numerical values in a healthcare context. The proposed method can impute both continuous and discrete features. It imputes each missing value by predicting it from the other observed values in the dataset. To evaluate the algorithm, we conducted several experiments on five benchmark healthcare datasets and compared it with other commonly used imputation methods, viz. missForest, KNNImpute, Multivariate Imputation by Chained Equations (MICE), and SoftImpute. The results were validated using Root Mean Square Error (RMSE) and Coefficient of Determination (R²) scores. On these datasets, the proposed algorithm outperformed the existing imputation techniques it was compared with.
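The general idea in the abstract (predict each missing entry from the other observed values with an Extra Trees model, then score the result with RMSE and R²) can be illustrated with a minimal sketch. This is not the ExtraImpute algorithm itself, whose details are not given here. It assumes scikit-learn's `IterativeImputer` wrapped around an `ExtraTreesRegressor`, a synthetic three-column dataset, and a 10% random masking rate. It treats every column as continuous. The helper names `make_toy_data`, `mask_at_random`, and `impute_with_extra_trees` are hypothetical.

```python
import numpy as np
# IterativeImputer is experimental in scikit-learn and must be enabled first.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.impute import IterativeImputer
from sklearn.metrics import mean_squared_error, r2_score


def make_toy_data(n_rows=300, seed=0):
    """Build a synthetic numeric table with correlated columns (not real data)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=n_rows)
    b = 2.0 * a + rng.normal(scale=0.1, size=n_rows)
    c = a - b + rng.normal(scale=0.1, size=n_rows)
    return np.column_stack([a, b, c])


def mask_at_random(X, missing_rate=0.1, seed=0):
    """Return a copy of X with cells set to NaN at random, plus the mask used."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) < missing_rate
    X_missing = X.copy()
    X_missing[mask] = np.nan
    return X_missing, mask


def impute_with_extra_trees(X_missing, seed=0):
    """Fill NaNs column by column, regressing each column on the others."""
    estimator = ExtraTreesRegressor(n_estimators=50, random_state=seed)
    imputer = IterativeImputer(estimator=estimator, max_iter=5, random_state=seed)
    return imputer.fit_transform(X_missing)


X_true = make_toy_data()
X_missing, mask = mask_at_random(X_true)
X_imputed = impute_with_extra_trees(X_missing)

# Score only the cells that were hidden, since observed cells are left unchanged.
rmse = np.sqrt(mean_squared_error(X_true[mask], X_imputed[mask]))
r2 = r2_score(X_true[mask], X_imputed[mask])
print(f"RMSE on masked cells: {rmse:.3f}, R2: {r2:.3f}")
```

Masking known values and scoring only those cells is one common way to obtain a ground truth for RMSE and R²; whether the paper's experiments used this exact protocol is not stated in the abstract.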
Download File
Full text not available from this repository.
Official URL or Download Paper: http://www.jait.us/index.php?m=content&c=index&a=s...
Additional Metadata
Item Type: | Article |
---|---|
Divisions: | Faculty of Computer Science and Information Technology |
DOI Number: | https://doi.org/10.12720/jait.13.5.470-476 |
Publisher: | Engineering and Technology Publishing |
Keywords: | Extra trees; Healthcare; Imputation; Missing values |
Depositing User: | Ms. Che Wa Zakaria |
Date Deposited: | 06 Oct 2023 23:13 |
Last Modified: | 06 Oct 2023 23:13 |
Altmetrics: | http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.12720/jait.13.5.470-476 |
URI: | http://psasir.upm.edu.my/id/eprint/101446 |