Citation
Jayaramu, Veianthan
(2022)
Leptospirosis modelling using hydrometeorological indices and random forest machine learning for humid tropical north-east Peninsular Malaysia.
Masters thesis, Universiti Putra Malaysia.
Abstract
Leptospirosis is a zoonotic tropical disease caused by pathogenic Leptospira sp.
whose transmission has been linked to extreme hydrometeorological
phenomena. Hydrometeorological variability, expressed as average and
extreme indices, has been used before as a driver in statistical prediction of
disease occurrence; however, their importance and predictive capacity are still
little known. Random forest classification models of leptospirosis occurrence
were developed to identify the important hydrometeorological indices and to
assess the models’ prediction accuracy, sensitivity, and specificity based on the sets of
indices used, using case data from three districts in Kelantan, Malaysia. This
region experiences annual monsoonal rainfall and flooding, and records high
leptospirosis incidence rates. First, hydrometeorological data including rainfall,
streamflow, water level, relative humidity and temperature were derived into 164
weekly average and extreme indices in accordance with the Expert Team on
Climate Change Detection and Indices (ETCCDI). Then, the weekly number of
cases was classified into binary classes ‘high’ and ‘low’ based on an average
threshold. Seventeen models based on ‘average’, ‘extreme’ and ‘mixed’ sets of indices
– based on the type of indices used as input – were trained by optimizing the
feature subsets using the embedded approach that utilized the mean decrease
Gini (MDG) scores. The variable importance was assessed through cross
correlation analysis and the MDG scores. The average and extreme models
achieved similar ranges of prediction accuracy, while the mixed
models showed some improvement. An extreme model was the most sensitive,
while an average model was the most specific. The time lag associated with
the driving indices agreed with the seasonality of the monsoon. The variable
importance analysis based on the MDG scores indicated that overall, the rainfall
(extreme) factor dominated, suggesting its strong influence on leptospirosis
incidence while the streamflow variable was the least important to the model
development despite showing higher cross correlations with leptospirosis.
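
Two of the steps described in the abstract, labelling weekly case counts as ‘high’ or ‘low’ against an average threshold and checking the time lag between an index and case counts, can be illustrated with a minimal Python sketch. It is not the thesis code. The data are made up, the function names are hypothetical, and treating a week as ‘high’ only when its count strictly exceeds the mean is an assumption, since the abstract does not define the threshold rule. The random forest training and MDG ranking are not shown.

```python
from math import sqrt


def label_weeks(cases):
    """Label a week 'high' if its count exceeds the mean weekly count (assumed rule)."""
    threshold = sum(cases) / len(cases)
    return ["high" if c > threshold else "low" for c in cases]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def lagged_correlation(index, cases, max_lag):
    """Correlate the index at week t with cases at week t + lag, for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = index[: len(index) - lag]  # index values that have a case count `lag` weeks later
        y = cases[lag:]                # case counts shifted forward by `lag` weeks
        result[lag] = pearson(x, y)
    return result


# Made-up illustrative data, not taken from the thesis.
weekly_rain = [10, 80, 20, 15, 90, 25, 10, 12]   # hypothetical weekly rainfall index
weekly_cases = [1, 2, 9, 2, 1, 8, 2, 1]          # hypothetical weekly case counts

labels = label_weeks(weekly_cases)
corr = lagged_correlation(weekly_rain, weekly_cases, max_lag=3)
best_lag = max(corr, key=corr.get)

print(labels)
print(best_lag, round(corr[best_lag], 3))
```

In this toy series the case peaks follow the rainfall peaks by one week, so the strongest correlation appears at a lag of one week.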