UPM Institutional Repository

Evaluation of missing values imputation methods towards the effectiveness of asset valuation prediction model


Mohd Jaya, Mohd Izham and Sidi, Fatimah and Affendey, Lilly Suriani and Ishak, Iskandar and A. Jabar, Marzanah (2019) Evaluation of missing values imputation methods towards the effectiveness of asset valuation prediction model. In: International Symposium on ICT Management and Administration (ISICTMA2019), 31 July-2 Aug. 2019, Putrajaya Marriott Hotel, Malaysia. (pp. 11-15).


Missing values are a common problem in datasets from any field of research. A data value can be missing for numerous reasons, such as non-response to items in interviews and surveys, equipment malfunction, human error and faulty data transmission. Missing values in a dataset need to be managed using appropriate imputation methods, which estimate approximate values to replace the missing ones. Missing values also lead to data quality problems, which in turn result in inaccurate decisions. In this work, we compared and evaluated various imputation methods, including deletion of records with missing values (DEL), mean value imputation (MEAN), k-Nearest Neighbor (KNN), Predictive Mean Matching (PMM), MissForest and the Ontology-based Framework for Financial Decision Making (OFFDM), with respect to the effectiveness of an asset valuation prediction model. In portfolio management, an asset valuation prediction model is used to aid the decision-making process. Additionally, we adopted the MissForest method in the OFFDM with the aim of improving the OFFDM. We conducted several experiments using different datasets derived from the different imputation methods to measure the accuracy, Root Mean Squared Error (RMSE) and F-measure of the prediction model, which was built as an Artificial Neural Network (ANN). We found that the dataset derived from DEL resulted in the lowest accuracy and the highest RMSE, whereas the adoption of the MissForest method in OFFDM resulted in the highest accuracy and the second lowest RMSE. The selection of an imputation method depends on the severity of the task at hand, as each method differs in complexity and efficiency. An imputation method such as MissForest is efficient but requires more computational resources. On the other hand, simpler methods such as DEL are still popular due to their simplicity but are less efficient.
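To make the comparison concrete, the following is a minimal sketch of three of the simpler methods named in the abstract (DEL, MEAN and KNN) together with the RMSE metric. It is not the authors' implementation: the function names, the toy dataset, the use of `None` to mark a missing value and the choice of `k=2` are all assumptions made for illustration, and RMSE is applied here directly to the imputed value rather than to an ANN's predictions. PMM, MissForest and OFFDM are not shown.

```python
import math


def delete_incomplete(rows):
    """DEL: drop every record that contains a missing value (None)."""
    return [list(row) for row in rows if None not in row]


def mean_impute(rows):
    """MEAN: replace each missing value with the mean of the observed
    values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def knn_impute(rows, k=2):
    """KNN: replace each missing value with the column mean over the k
    nearest complete records. Distance is Euclidean, computed only on
    the columns that are observed in the incomplete record."""
    complete = delete_incomplete(rows)
    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue
        observed_cols = [j for j, v in enumerate(row) if v is not None]

        def distance(other, row=row, cols=observed_cols):
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in cols))

        neighbours = sorted(complete, key=distance)[:k]
        result.append([
            sum(n[j] for n in neighbours) / len(neighbours) if v is None else v
            for j, v in enumerate(row)
        ])
    return result


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data: the second column is twice the first, and the
# value hidden in the third record is 6.0.
rows = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, None],
    [4.0, 8.0],
]
true_value = [6.0]

del_rows = delete_incomplete(rows)   # one record is lost entirely
mean_rows = mean_impute(rows)        # hidden value estimated by column mean
knn_rows = knn_impute(rows, k=2)     # hidden value estimated from neighbours

mean_error = rmse(true_value, [mean_rows[2][1]])
knn_error = rmse(true_value, [knn_rows[2][1]])
```

On this toy data DEL discards a record instead of estimating the value, while MEAN and KNN keep all four records and differ only in how close their estimate comes to the hidden value. This illustrates the kind of trade-off the abstract describes, but it says nothing about the results reported in the paper.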

Additional Metadata

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculty of Computer Science and Information Technology
Publisher: Database Technologies and Applications Research Group (DbTA), Faculty of Computer Science and Information Technology, Universiti Putra Malaysia
Keywords: Missing value; Imputation; Data quality
Depositing User: Nabilah Mustapa
Date Deposited: 07 Oct 2019 07:36
Last Modified: 07 Oct 2019 07:36
URI: http://psasir.upm.edu.my/id/eprint/75514
