UPM Institutional Repository

Improved normalization and standardization techniques for higher purity in K-means clustering


Citation

Dalatu, Paul Inuwa and Fitrianto, Anwar and Mustapha, Aida (2016) Improved normalization and standardization techniques for higher purity in K-means clustering. Far East Journal of Mathematical Sciences, 100 (6). pp. 859-871. ISSN 0972-0871

Abstract

Clustering is one of the primary data mining tools; it helps researchers understand the natural grouping of attributes in datasets. It is an unsupervised classification method that aims to partition objects so that objects in the same cluster are similar, while objects belonging to different clusters differ significantly with respect to their attributes. The K-means algorithm is a well-known and fast non-hierarchical clustering technique, and because of its simplicity it has been used in many fields. This paper proposes improved normalization and standardization techniques for higher purity in K-means clustering. The techniques were tested on benchmark datasets from the UCI Machine Learning Repository, and all of the proposed techniques performed much better than conventional K-means and the three classic transformations, as shown by the purity and Rand index accuracy results.
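For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how purity and the Rand index are commonly computed, together with two conventional feature transforms (min-max normalization and z-score standardization). It is an illustration only: the function names are hypothetical, the two transforms are assumed examples of conventional scaling, and the paper's proposed techniques are not described in this record and are not reproduced here.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def min_max_normalize(values):
    """Rescale one feature column to [0, 1] (conventional min-max; assumed example)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_standardize(values):
    """Centre one feature column to mean 0 and scale by the population std (assumed example)."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def purity(true_labels, cluster_labels):
    """Fraction of objects that belong to the majority class of their cluster."""
    clusters = {}
    for t, c in zip(true_labels, cluster_labels):
        clusters.setdefault(c, Counter())[t] += 1
    majority_total = sum(max(counts.values()) for counts in clusters.values())
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_labels):
    """Fraction of object pairs on which the two labelings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum(
        (true_labels[i] == true_labels[j]) == (cluster_labels[i] == cluster_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    # Toy labels, made up for illustration; not data from the paper.
    true_labels = [0, 0, 0, 1, 1, 1]
    cluster_labels = [0, 0, 1, 1, 1, 1]
    print(round(purity(true_labels, cluster_labels), 4))
    print(round(rand_index(true_labels, cluster_labels), 4))
    print(min_max_normalize([1.0, 2.0, 3.0]))
```

In a typical workflow each feature column would be transformed before K-means is run, and the resulting cluster assignments would then be scored against the known class labels with these two measures.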


Official URL or Download Paper: http://www.pphmj.com/abstract/10134.htm

Additional Metadata

Item Type: Article
Subject: Normalization; Standardization; K-means algorithm; Clustering; Purity; Rand index
Divisions: Faculty of Science
DOI Number: https://doi.org/10.17654/MS100060859
Publisher: Pushpa Publishing House
Depositing User: Nurul Ainie Mokhtar
Date Deposited: 27 Mar 2018 01:36
Last Modified: 27 Mar 2018 01:36
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.17654/MS100060859
URI: http://psasir.upm.edu.my/id/eprint/54519