UPM Institutional Repository

Bayesian clustering of mixed-type data with relevant variable identification


Citation

Burhanuddin, Nurul Afiqah and Ibrahim, Kamarulzaman and Adam, Mohd Bakri and Mustapha, Norwati and Zulkafli, Hani Syahida (2024) Bayesian clustering of mixed-type data with relevant variable identification. Communications in Statistics: Simulation and Computation. ISSN 0361-0918; eISSN: 1532-4141

Abstract

This paper presents a Bayesian nonparametric model for clustering datasets with continuous, ordinal, and nominal variables. The ordinal and nominal variables are treated using a latent variable framework based on the multivariate probit and multinomial probit models. Combining the continuous variables with the latent continuous variables allows us to jointly model a set of mixed-type variables via the Dirichlet process Gaussian mixture model. The use of a hierarchical shrinkage prior on the component means leads to improved clustering performance and provides an intuitive way to identify relevant clustering variables. Numerical results on simulated and real data illustrate the applicability of the proposed model.
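The latent variable idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cutpoints, the unit variances, and the baseline-category convention for the nominal variable are all assumptions chosen for illustration. The sketch only shows how one latent Gaussian draw could map to a row with a continuous, an ordinal, and a nominal value.

```python
import bisect
import random


def ordinal_from_latent(z, cutpoints):
    """Map a latent continuous value z to an ordinal level 0..len(cutpoints).

    Level k is observed when cutpoints[k-1] < z <= cutpoints[k], with
    -inf and +inf as implicit outer cutpoints (assumed convention).
    """
    return bisect.bisect_left(cutpoints, z)


def nominal_from_latent(utilities):
    """Map latent utilities to a nominal category (assumed convention).

    Category 0 is a baseline with utility fixed at 0. Category k >= 1 is
    observed when utilities[k-1] is the largest utility and is positive.
    """
    best = max(utilities)
    return 0 if best <= 0 else utilities.index(best) + 1


def simulate_row(mean, cutpoints, rng):
    """Draw one mixed-type row from a single Gaussian component.

    mean = [continuous mean, ordinal latent mean, nominal utility means...]
    Unit variances and independence are simplifying assumptions.
    """
    latent = [rng.gauss(m, 1.0) for m in mean]
    return {
        "continuous": latent[0],
        "ordinal": ordinal_from_latent(latent[1], cutpoints),
        "nominal": nominal_from_latent(latent[2:]),
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
row = simulate_row([0.0, 1.0, -0.5, 2.0], cutpoints=[-1.0, 0.5], rng=rng)
print(row)
```

Under this view, clustering operates on the latent continuous vector rather than on the observed categories, which is what allows a single Gaussian mixture to cover all three variable types.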


Additional Metadata

Item Type: Article
Divisions: Faculty of Computer Science and Information Technology
Faculty of Science
Institute for Mathematical Research
DOI Number: https://doi.org/10.1080/03610918.2024.2361135
Publisher: Taylor and Francis L
Keywords: Dirichlet process mixture model; Latent variables; Model-based clustering; Variable selection
Depositing User: MS. HADIZAH NORDIN
Date Deposited: 02 Mar 2026 00:09
Last Modified: 02 Mar 2026 00:09
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1080/03610918.2024.2361135
URI: http://psasir.upm.edu.my/id/eprint/120059