UPM Institutional Repository

Improved clustering using robust and classical principal component


Citation

Hassn, Ahmed Kadom (2017) Improved clustering using robust and classical principal component. Masters thesis, Universiti Putra Malaysia.

Abstract

The k-means algorithm is a popular data clustering algorithm. It aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as the prototype of the cluster. Finding the appropriate number of clusters for a given data set is generally a trial-and-error process, made more difficult by the subjective nature of deciding what constitutes a ‘correct’ clustering. When the dimension of the data is large, it is often difficult to apply the k-means clustering algorithm because it requires considerable computation time. To remedy this problem, we propose to integrate principal component analysis (PCA), which is useful for reducing the dimensionality of a data set, with the k-means clustering algorithm. We call our proposed method k-means by principal components (pc1). In this study, the kernels created by the k-means method are replaced with kernels created by the PCA method, which reduces the dimensionality of the data. The results of the study show that k-means by PCA is faster and more efficient than the classical k-means algorithm. Both the classical k-means algorithm and the k-means by PCA algorithm are very sensitive to the presence of outliers. Hence, k-means by robust PCA is developed to rectify the problem of outliers in the data set. The findings indicate that, in the absence of outliers, k-means by PCA and k-means by robust PCA perform equally well. Nonetheless, k-means by robust PCA is not much affected by outliers, in contrast to k-means by classical PCA.
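
The general idea of clustering on principal component scores can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the algorithm's details, so the sketch assumes that the label "pc1" refers to the first principal component, finds that component by power iteration on the sample covariance matrix, and runs a plain one-dimensional k-means on the projected scores. All function names and the toy data are hypothetical. A robust variant would presumably replace the mean and covariance with robust estimates, but the abstract does not say which estimator is used, so none is shown.

```python
import math
import random


def first_principal_component(rows, iters=200):
    """Return (mean, unit direction) of the leading principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Project each centered observation onto the given direction."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]


def kmeans_1d(scores, k, iters=100, seed=0):
    """Plain k-means (Lloyd's iterations) on one-dimensional scores."""
    rng = random.Random(seed)
    centers = rng.sample(scores, k)  # initial centers drawn from the data
    labels = [0] * len(scores)
    for _ in range(iters):
        # Assignment step: nearest center for every score.
        labels = [min(range(k), key=lambda c: abs(s - centers[c]))
                  for s in scores]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [s for s, lab in zip(scores, labels) if lab == c]
            new_centers.append(sum(members) / len(members)
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def kmeans_on_pc1(rows, k):
    """Reduce the data to first-PC scores, then cluster the scores."""
    mean, direction = first_principal_component(rows)
    scores = project(rows, mean, direction)
    return kmeans_1d(scores, k)


# Hypothetical toy data: two well-separated groups in three dimensions.
data = [
    [1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [0.8, 1.0, 1.1],
    [5.0, 5.1, 4.9], [5.2, 4.9, 5.0], [4.8, 5.0, 5.1],
]
labels, centers = kmeans_on_pc1(data, k=2)
print(labels)
```

Clustering the one-dimensional scores instead of the full observations is what reduces the work per iteration: each distance computation involves a single number rather than a d-dimensional vector. Because the sample mean and covariance are themselves sensitive to extreme values, this classical version inherits the outlier sensitivity that the abstract describes.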



Additional Metadata

Item Type: Thesis (Masters)
Subject: Algorithms
Call Number: FS 2017 47
Chairman Supervisor: Anwar Fitrianto, PhD
Divisions: Faculty of Science
Depositing User: Editor
Date Deposited: 07 Aug 2019 06:51
Last Modified: 07 Jul 2022 03:07
URI: http://psasir.upm.edu.my/id/eprint/70922
