Abstract
Clustering is one of the primary tools of data mining. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning: objects in the same cluster are similar, while objects in different clusters differ significantly with respect to their attributes. However, the classical standardized Euclidean distance, which uses the standard deviation to down-weight the maximum points of the ith feature in distance-based clustering, has been criticized by many scholars on the grounds that it produces outliers, lacks robustness, and has a 0% breakdown point. It also has low efficiency under the normal distribution. To remedy these problems, we suggest two statistical estimators with a 50% breakdown point, namely the Sn and Qn estimators, which have 58% and 82% efficiency, respectively. The proposed methods evidently outperformed the existing methods in down-weighting the maximum points of the ith feature in distance-based clustering analysis.
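As a rough illustration of the idea in the abstract, the sketch below replaces the per-feature standard deviation in a standardized Euclidean distance with a robust scale estimate. It is not the paper's implementation. The function names (`qn_scale`, `sn_scale`, `robust_scales`, `scaled_euclidean`) and the toy data are assumptions made here. The estimators are naive O(n²) versions that use the usual asymptotic Gaussian consistency constants (2.2219 for Qn, 1.1926 for Sn), omit finite-sample correction factors, and, for Sn, use ordinary medians rather than the low/high medians of the formal definition.

```python
import numpy as np


def qn_scale(x):
    """Naive Qn: a scaled order statistic of the pairwise absolute differences."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = n // 2 + 1
    k = h * (h - 1) // 2                  # k = C(h, 2)
    i, j = np.triu_indices(n, k=1)        # all index pairs with i < j
    diffs = np.sort(np.abs(x[i] - x[j]))
    return 2.2219 * diffs[k - 1]          # k-th smallest pairwise difference


def sn_scale(x):
    """Simplified Sn: median over i of the median over j of |x_i - x_j|."""
    x = np.asarray(x, dtype=float)
    pairwise = np.abs(x[:, None] - x[None, :])
    return 1.1926 * np.median(np.median(pairwise, axis=1))


def robust_scales(X, estimator):
    """Apply a scale estimator to every feature (column) of X."""
    X = np.asarray(X, dtype=float)
    return np.array([estimator(X[:, f]) for f in range(X.shape[1])])


def scaled_euclidean(a, b, scale):
    """Euclidean distance with each feature divided by its scale estimate."""
    a, b, scale = (np.asarray(v, dtype=float) for v in (a, b, scale))
    return float(np.sqrt(np.sum(((a - b) / scale) ** 2)))


# Hypothetical toy data: one feature carries a gross outlier (1000).
X = np.array([[1, 10], [2, 11], [3, 12], [4, 13], [5, 14],
              [6, 15], [7, 16], [8, 17], [9, 18], [1000, 19]], dtype=float)

sd = X.std(axis=0, ddof=1)                # classical scale, inflated by the outlier
qn = robust_scales(X, qn_scale)
sn = robust_scales(X, sn_scale)

print("std:", sd, "Qn:", qn, "Sn:", sn)
print("distance (std):", scaled_euclidean(X[0], X[1], sd))
print("distance (Qn): ", scaled_euclidean(X[0], X[1], qn))
```

The resulting per-feature scales could then be used to rescale the data before a distance-based method such as K-means. A single extreme value inflates the standard deviation of the first feature, whereas the Qn and Sn scales stay close to the spread of the remaining points.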
Download File
Official URL or Download Paper: http://www.pertanika.upm.edu.my/Pertanika%20PAPERS...
Additional Metadata
Item Type: | Article |
---|---|
Divisions: | Faculty of Science; Institute for Mathematical Research |
Publisher: | Universiti Putra Malaysia Press |
Keywords: | Clustering; Estimators; K-means; Simulation; Weighted |
Depositing User: | Nabilah Mustapa |
Date Deposited: | 12 Feb 2019 07:04 |
Last Modified: | 12 Feb 2019 07:04 |
URI: | http://psasir.upm.edu.my/id/eprint/66312 |