UPM Institutional Repository

Effect of missing value methods on Bayesian network classification of hepatitis data


Citation

Nazziwa Aisha and Adam, Mohd. Bakri and Shohaimi, Shamarina (2013) Effect of missing value methods on Bayesian network classification of hepatitis data. International Journal of Computer Science and Telecommunications, 4 (6). pp. 8-12. ISSN 2047-3338

Abstract

Missing value imputation methods are widely used to handle missing values in statistical analysis. In classification tasks, the choice of imputation method can affect the accuracy of Bayesian network classifiers. This paper studies the effect of missing value treatment on the prediction accuracy of four Bayesian network classifiers used to predict death in acute chronic hepatitis patients. Missing data were imputed using nine methods: replacement with the most common attribute value, support vector machine imputation (SVMI), K-nearest neighbor imputation (KNNI), fuzzy K-means clustering imputation (FKMI), K-means clustering imputation (KMI), weighted imputation with K-nearest neighbor (WKNNI), regularized expectation maximization (EM), singular value decomposition imputation (SVDI), and local least squares imputation (LLSI). The classification accuracies of the naive Bayes (NB), tree augmented naive Bayes (TAN), boosted augmented naive Bayes (BAN), and general Bayes network (GBN) classifiers were recorded. The SVMI and LLSI methods improved the classification accuracy of the classifiers. Ignoring missing values performed better than seven of the imputation methods. Among the classifiers, TAN achieved the best average classification accuracy of 86.3%, followed by BAN with 85.1%.
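As a rough illustration of the two simplest treatments named in the abstract (replacing a missing entry with the most common attribute value, and ignoring records with missing values), the following is a minimal sketch only. The function names, the use of `None` as the missing-value marker, and the toy records are assumptions made for illustration; they are not taken from the paper or from the hepatitis dataset.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def impute_most_common(rows):
    """Replace each missing value with the most frequent observed value in its column."""
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not MISSING]
        # most_common(1) returns [(value, count)]; a fully missing column stays missing
        modes.append(Counter(observed).most_common(1)[0][0] if observed else MISSING)
    return [[modes[j] if v is MISSING else v for j, v in enumerate(r)] for r in rows]


def drop_incomplete(rows):
    """Ignore missing values by keeping only fully observed records."""
    return [r for r in rows if all(v is not MISSING for v in r)]


# Toy categorical records (hypothetical, for illustration only)
records = [
    ["yes", "no", "male"],
    ["yes", None, "female"],
    [None, "no", "male"],
    ["no", "no", None],
]

imputed = impute_most_common(records)   # same number of records, gaps filled
complete = drop_incomplete(records)     # only the fully observed records remain
```

In this sketch, imputation keeps all four records while dropping incomplete records keeps only one, which is the trade-off a classifier trained on the treated data would then be exposed to.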


Official URL or Download Paper: http://www.ijcst.org/Volume4/Issue6/

Additional Metadata

Item Type: Article
Divisions: Faculty of Science
Publisher: Sysbase Solution
Keywords: Bayesian network classifiers; Missing data; Imputation; Hepatitis dataset; Classification and data mining.
Depositing User: Umikalthom Abdullah
Date Deposited: 27 Aug 2014 01:07
Last Modified: 07 Dec 2015 03:52
URI: http://psasir.upm.edu.my/id/eprint/30217
