UPM Institutional Repository

Leptospirosis modelling using hydrometeorological indices and random forest machine learning for humid tropical north-east Peninsular Malaysia


Citation

Jayaramu, Veianthan (2022) Leptospirosis modelling using hydrometeorological indices and random forest machine learning for humid tropical north-east Peninsular Malaysia. Masters thesis, Universiti Putra Malaysia.

Abstract

Leptospirosis is a zoonotic tropical disease caused by pathogenic Leptospira sp. whose transmission has been linked to extreme hydrometeorological phenomena. Hydrometeorological variability, in the form of average and extreme indices, has been used before as a driver in statistical prediction of disease occurrence; however, the importance and predictive capacity of these indices are still little known. Random forest classification models of leptospirosis occurrence were developed to identify the important hydrometeorological indices and to evaluate the models’ prediction accuracy, sensitivity, and specificity for the sets of indices used, using case data from three districts in Kelantan, Malaysia. This region experiences annual monsoonal rainfall and flooding, and records high leptospirosis incidence rates. First, hydrometeorological data including rainfall, streamflow, water level, relative humidity and temperature were derived into 164 weekly average and extreme indices in accordance with the Expert Team on Climate Change Detection and Indices (ETCCDI). Then, the weekly number of cases was classified into the binary classes ‘high’ and ‘low’ based on an average threshold. Seventeen models, grouped as ‘average’, ‘extreme’ and ‘mixed’ according to the type of indices used as input, were trained by optimizing the feature subsets using an embedded approach based on the mean decrease Gini (MDG) scores. The variable importance was assessed through cross-correlation analysis and the MDG scores. The average and extreme models showed similar prediction accuracy ranges, while the mixed models showed some improvement. An extreme model was the most sensitive, while an average model was the most specific. The time lag associated with the driving indices agreed with the seasonality of the monsoon. The variable importance analysis based on the MDG scores indicated that, overall, the rainfall (extreme) factor dominated, suggesting its strong influence on leptospirosis incidence, while the streamflow variable was the least important to the model development despite showing higher cross-correlations with leptospirosis.
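
The labelling and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis code. It assumes that the "average threshold" is the mean of the weekly case series and that a week counts as 'high' when its count is strictly above that mean. The function names, the case counts and the predicted labels are all hypothetical, made up for illustration.

```python
from statistics import mean


def label_weeks(weekly_cases):
    """Label each week 'high' or 'low' relative to the series mean.

    Assumption: 'high' means strictly above the mean; the abstract only
    says the classes are based on an average threshold.
    """
    threshold = mean(weekly_cases)
    return ["high" if count > threshold else "low" for count in weekly_cases]


def classification_metrics(actual, predicted, positive="high"):
    """Compute accuracy, sensitivity and specificity from two label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of truly 'high' weeks that were predicted 'high'.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of truly 'low' weeks that were predicted 'low'.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up weekly case counts (not data from the thesis).
weekly_cases = [0, 1, 5, 7, 2, 0, 6, 1]
actual = label_weeks(weekly_cases)

# Hypothetical classifier output for the same eight weeks.
predicted = ["low", "low", "high", "high", "high", "low", "low", "low"]

metrics = classification_metrics(actual, predicted)
print(actual)
print(metrics)
```

A model tuned for sensitivity would miss fewer 'high' weeks at the cost of more false alarms, which is the trade-off the abstract reports between the extreme and average models.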


Download File

115737.pdf

Download (937kB)
Official URL or Download Paper: http://ethesis.upm.edu.my/id/eprint/18246

Additional Metadata

Item Type: Thesis (Masters)
Subject: Hydrometeorology
Subject: Leptospirosis
Call Number: FK 2022 126
Chairman Supervisor: Zed Diyana binti Zulkafli, PhD
Divisions: Faculty of Engineering
Depositing User: Ms. Rohana Alias
Date Deposited: 13 Mar 2025 07:47
Last Modified: 13 Mar 2025 07:47
URI: http://psasir.upm.edu.my/id/eprint/115737