UPM Institutional Repository

Bayesian logistic regression model on risk factors of type 2 diabetes mellitus


Chiaka, Emenyonu Sandra (2016) Bayesian logistic regression model on risk factors of type 2 diabetes mellitus. Masters thesis, Universiti Putra Malaysia.


The logistic regression model is well established and is commonly used to analyse a binary outcome (dependent) variable by relating it to several independent variables. From the frequentist point of view, the coefficients of the variables are estimated by the method of maximum likelihood. Bayesian analysis, by contrast, allows prior information to be incorporated: a prior distribution is assumed for each coefficient of interest and is combined with the likelihood function to obtain the posterior distribution. The Bayesian logistic regression (BLR) methods used here rely on the random walk Metropolis-Hastings algorithm and the Gibbs sampler, with non-informative flat and non-informative non-flat prior distributions, to obtain the posterior distribution of each coefficient. The flat prior is already widely used in different fields of study. The non-flat prior, which is the main focus of this research, has to the best of our knowledge not previously been applied to any type 2 diabetes mellitus (T2DM) dataset in Malaysia. This study evaluates risk factors including age, ethnicity, gender, physical activity, hypertension, body mass index, family history of diabetes and waist circumference. The coefficients of these variables were first estimated by the method of maximum likelihood to identify the significant variables, and the significant variables were then re-estimated using the BLR method. The BLR approach, via both the Gibbs sampler and the random walk Metropolis algorithm, suggests that family history of diabetes, waist circumference and body mass index are the significant risk factors associated with type 2 diabetes mellitus.
The model results also show a slight decrease in the posterior standard deviations of the parameters under the Bayesian analysis with the non-flat prior distribution, compared with the Bayesian analysis using the non-informative flat prior. Because the differences between the models are small, all the models are considered adequate and exhibit good model fit.
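
The sampling approach described in the abstract can be illustrated with a minimal sketch of random walk Metropolis for logistic regression. This is not the thesis code: the data are synthetic, the function names `log_posterior` and `random_walk_metropolis` are hypothetical, and because the abstract does not specify the non-flat prior, an independent normal prior with standard deviation `prior_sd` is assumed here as a stand-in.

```python
import numpy as np

def log_posterior(beta, X, y, prior_sd=None):
    """Log posterior up to an additive constant.

    prior_sd=None  -> flat prior (posterior proportional to the likelihood).
    prior_sd=float -> independent N(0, prior_sd^2) priors, an ASSUMED
                      example of a non-flat prior, not the thesis's choice.
    """
    eta = X @ beta
    # Bernoulli log-likelihood: sum(y*eta - log(1 + exp(eta))), computed stably.
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    if prior_sd is None:
        return loglik
    return loglik - 0.5 * np.sum(beta ** 2) / prior_sd ** 2

def random_walk_metropolis(X, y, n_iter=5000, step=0.1, prior_sd=None, seed=0):
    """Draw from the posterior with a symmetric Gaussian random walk proposal."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    current = log_posterior(beta, X, y, prior_sd)
    draws = np.empty((n_iter, X.shape[1]))
    accepted = 0
    for i in range(n_iter):
        proposal = beta + step * rng.standard_normal(beta.shape)
        candidate = log_posterior(proposal, X, y, prior_sd)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
            accepted += 1
        draws[i] = beta
    return draws, accepted / n_iter

# Synthetic data (hypothetical): an intercept plus two standardised covariates.
data_rng = np.random.default_rng(42)
n = 500
X = np.column_stack([np.ones(n), data_rng.standard_normal((n, 2))])
true_beta = np.array([-0.5, 1.0, 0.8])
y = data_rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta))))

draws_flat, acc_flat = random_walk_metropolis(X, y, prior_sd=None)
draws_norm, acc_norm = random_walk_metropolis(X, y, prior_sd=10.0)

burn = 1000  # discard early draws taken before the chain settles
post_mean = draws_flat[burn:].mean(axis=0)
post_sd_flat = draws_flat[burn:].std(axis=0)
post_sd_norm = draws_norm[burn:].std(axis=0)
print("flat prior: posterior mean", post_mean, "posterior sd", post_sd_flat)
print("normal prior: posterior sd", post_sd_norm)
```

Comparing `post_sd_flat` with `post_sd_norm` mirrors the kind of comparison of posterior standard deviations reported in the abstract, although with a weak prior such as the one assumed here the two are expected to be close.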

Additional Metadata

Item Type: Thesis (Masters)
Subject: Logistic regression analysis
Subject: Diabetes - Statistical methods
Subject: Bayesian statistical decision theory
Call Number: FS 2016 45
Chairman Supervisor: Associate Professor Mohd Bakri Adam, PhD
Divisions: Institute for Mathematical Research
Depositing User: Ms. Nur Faseha Mohd Kadim
Date Deposited: 08 Sep 2021 00:45
Last Modified: 11 Mar 2022 01:28
URI: http://psasir.upm.edu.my/id/eprint/69118