UPM Institutional Repository

Clustering mixed-type data via Dirichlet process mixture model with cluster-specific covariance matrices


Citation

Burhanuddin, Nurul Afiqah and Ibrahim, Kamarulzaman and Zulkafli, Hani Syahida and Mustapha, Norwati (2024) Clustering mixed-type data via Dirichlet process mixture model with cluster-specific covariance matrices. Symmetry, 16 (6). art. no. 712. ISSN 2073-8994

Abstract

Many studies have shown successful applications of the Dirichlet process mixture model (DPMM) for clustering continuous data. Beyond continuous data, in practice, one can expect to see different data types, including ordinal and nominal data. Existing DPMMs for clustering mixed-type data assume a strict covariance matrix structure, resulting in an overfit model. This article explores a DPMM for mixed-type data that allows the covariance matrix to differ from one cluster to another. We assume an underlying latent variable framework for ordinal and nominal data, which is then modeled jointly with the continuous data. The identifiability issue on the covariance matrix poses computational challenges, thus requiring a nonstandard inferential algorithm. The applicability and flexibility of the proposed model are illustrated through simulation examples and real data applications.
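
The latent variable idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' model or inference algorithm: the function name `simulate_mixed_cluster`, the threshold values, the rule mapping latent values to nominal levels, and all means and covariance matrices are hypothetical choices made for illustration. It shows two clusters whose latent Gaussian covariance matrices differ, with an ordinal variable obtained by thresholding one latent dimension and a nominal variable obtained from two latent utilities.

```python
import numpy as np


def simulate_mixed_cluster(rng, n, mean, cov, thresholds):
    """Draw n rows from one cluster of a latent Gaussian model.

    Latent layout (an illustrative assumption): column 0 is an observed
    continuous variable, column 1 underlies an ordinal variable, and
    columns 2-3 underlie a three-level nominal variable.
    """
    z = rng.multivariate_normal(mean, cov, size=n)
    continuous = z[:, 0]
    # Ordinal level = number of thresholds the latent value exceeds.
    ordinal = np.searchsorted(thresholds, z[:, 1])
    # Nominal level: 0 if both latent utilities are negative,
    # otherwise 1 + index of the larger utility (assumed convention).
    utilities = z[:, 2:4]
    nominal = np.where(utilities.max(axis=1) < 0, 0, 1 + utilities.argmax(axis=1))
    return continuous, ordinal, nominal


rng = np.random.default_rng(0)
thresholds = np.array([-0.5, 0.5])  # hypothetical cut points, 3 ordinal levels

# Two clusters with different latent covariance matrices (made-up values).
cov_a = np.eye(4)
cov_b = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.5],
])

cont_a, ord_a, nom_a = simulate_mixed_cluster(rng, 200, np.zeros(4), cov_a, thresholds)
cont_b, ord_b, nom_b = simulate_mixed_cluster(rng, 200, np.full(4, 2.0), cov_b, thresholds)
```

In a Dirichlet process mixture model the number of clusters and their parameters would be inferred from the data rather than fixed as they are here; the sketch only shows how continuous, ordinal, and nominal observations can share one latent Gaussian with a cluster-specific covariance matrix.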


Download File

Text: 113587.pdf - Published Version (1MB)
Available under License Creative Commons Attribution.

Official URL or Download Paper: https://www.mdpi.com/2073-8994/16/6/712

Additional Metadata

Item Type: Article
Divisions: Faculty of Computer Science and Information Technology; Faculty of Science; Institute for Mathematical Research
DOI Number: https://doi.org/10.3390/sym16060712
Publisher: Multidisciplinary Digital Publishing Institute (MDPI)
Keywords: Bayesian nonparametric; Dirichlet process mixture model; Latent variables; Mixed-type data; Model-based clustering
Depositing User: Ms. Azian Edawati Zakaria
Date Deposited: 14 Nov 2024 04:00
Last Modified: 14 Nov 2024 04:00
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.3390/sym16060712
URI: http://psasir.upm.edu.my/id/eprint/113587
