UPM Institutional Repository

Feature selection methods based on meteorological data for prediction of leptospirosis occurrence in Seremban, Malaysia


Citation

Rahmat, Mohamad Fariq (2019) Feature selection methods based on meteorological data for prediction of leptospirosis occurrence in Seremban, Malaysia. Masters thesis, Universiti Putra Malaysia.

Abstract

Predictive models are useful for preventing and controlling disease outbreaks. Such models can be built by analysing weather behavior in relation to disease occurrence. In Malaysia, leptospirosis has been among the diseases with the highest numbers of reported cases over the past 7 years, yet there is a lack of understanding and of modelling studies that would allow the development of an early warning system. In this study, a predictive model is developed using machine learning to capture the relation between weather variables, such as temperature, sum of rainfall, and relative humidity, and leptospirosis occurrence. The aim of this study is to predict the occurrence of leptospirosis in Seremban district using machine learning with meteorological data as input. The first objective is to investigate the best time lags for each weather variable using feature selection methods. The second objective is to develop, train and test a neural network model for disease prediction based on the selected features. Feature selection was conducted using two methods: first, through correlation analysis, and second, through graphical and non-graphical Exploratory Data Analysis (EDA). The neural network model was developed using backpropagation training, optimizing the number of hidden layers and hidden nodes. Model performance was measured using accuracy, sensitivity, and specificity. Correlation analysis showed that, in Seremban district, disease occurrence is more highly correlated with sum of rainfall at lags of 4 to 16 weeks and with temperature at a lag of 1 week, whereas EDA showed a high correlation between leptospirosis occurrence and temperature at a lag of 16 weeks and sum of rainfall at lags of 12 to 20 weeks. The study also showed that the predictive model can achieve high accuracy, between 80% and 84%, when the input variables follow the feature selection made by EDA and the number of hidden neurons is 10.
In conclusion, this study shows the trend of the environmental variables in predicting leptospirosis occurrence at different time lags. In addition, the predictive model can help public health authorities not only to predict the occurrence of the disease but also to prevent an outbreak from spreading to the community, by giving early warning based on future weather conditions.
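
The lag-based screening and the evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the choice of Pearson correlation, and the 0/1 outbreak labels are assumptions made for illustration only, and the abstract does not state which correlation coefficient was used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(weather, cases, lag):
    """Correlate the weather value at week t - lag with cases at week t."""
    if lag == 0:
        return pearson(weather, cases)
    # Drop the last `lag` weather values and the first `lag` case counts
    # so that each case count is paired with the weather `lag` weeks earlier.
    return pearson(weather[:-lag], cases[lag:])


def best_lag(weather, cases, max_lag):
    """Return the lag (in weeks) with the largest absolute correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: abs(lagged_correlation(weather, cases, lag)))


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary 0/1 labels.

    Assumes both classes are present in y_true (no zero-division guard).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # share of true outbreak weeks detected
    specificity = tn / (tn + fp)  # share of non-outbreak weeks recognised
    return accuracy, sensitivity, specificity
```

In such a scheme, the lags with the strongest correlation for each weather variable would be used to build the lagged inputs of the neural network, and the three measures would be computed on held-out weeks.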


Additional Metadata

Item Type: Thesis (Masters)
Subject: Imaging systems in meteorology
Subject: Meteorological instruments - Malaysia
Subject: Leptospirosis
Call Number: FK 2020 113
Chairman Supervisor: Asnor Juraiza Bt. Ishak, PhD
Divisions: Faculty of Engineering
Depositing User: Ms. Rohana Alias
Date Deposited: 25 Jul 2023 01:58
Last Modified: 25 Jul 2023 01:58
URI: http://psasir.upm.edu.my/id/eprint/104249