UPM Institutional Repository

Data stream clustering by divide and conquer approach based on vector model


Citation

Khalilian, Madjid and Mustapha, Norwati and Sulaiman, Nasir (2016) Data stream clustering by divide and conquer approach based on vector model. Journal of Big Data, 3 (1). pp. 1-21. ISSN 2196-1115

Abstract

Recently, many researchers have focused on data stream processing as an efficient method for extracting knowledge from big data. Data stream clustering is an unsupervised approach that is employed for huge data. The continuous effort on data stream clustering methods has one common goal, which is to achieve an accurate clustering algorithm. However, some issues are overlooked by previous works proposing data stream clustering solutions: (1) clustering datasets that include big segments of repetitive data, (2) monitoring the clustering structure for ordinal data streams and (3) determining important parameters such as k, the exact number of clusters in a stream of data. In this paper, the DCSTREAM method is proposed with regard to these issues to cluster big datasets using the vector model and a k-Means divide and conquer approach. Experimental results show that DCSTREAM can achieve superior quality and performance compared to the STREAM and ConStream methods for abrupt and gradual real-world datasets. Results also show that the use of batch processing in DCSTREAM and ConStream is time consuming compared to STREAM, but it avoids further analysis for detecting outliers and novel micro-clusters.
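
The abstract names a k-Means divide and conquer approach without giving its details. As a rough illustration only, the following minimal sketch shows the generic idea of clustering a stream chunk by chunk and then clustering the weighted chunk centers. It is not the DCSTREAM algorithm: the vector model is not represented, and the function names (`kmeans`, `cluster_chunks`), the fixed seed, and the iteration count are all assumptions made for this example.

```python
import random


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centers[j])),
    )


def kmeans(points, k, weights=None, iters=20, seed=0):
    """Weighted Lloyd-style k-Means on tuples.

    Returns (centers, masses), where each mass is the total weight
    assigned to that center. Hypothetical helper, not from the paper.
    """
    rng = random.Random(seed)
    if weights is None:
        weights = [1.0] * len(points)
    k = min(k, len(points))
    centers = rng.sample(points, k)
    dim = len(points[0])
    for _ in range(iters):
        assign = [nearest(p, centers) for p in points]
        for j in range(k):
            members = [(p, w) for p, w, a in zip(points, weights, assign) if a == j]
            total = sum(w for _, w in members)
            if total > 0:
                # Move the center to the weighted mean of its members.
                centers[j] = tuple(
                    sum(p[d] * w for p, w in members) / total for d in range(dim)
                )
    # Final pass: accumulate the weight carried by each center.
    masses = [0.0] * k
    for p, w in zip(points, weights):
        masses[nearest(p, centers)] += w
    kept = [(c, m) for c, m in zip(centers, masses) if m > 0]
    return [c for c, _ in kept], [m for _, m in kept]


def cluster_chunks(chunks, k):
    """Divide: cluster each chunk. Conquer: cluster the weighted centers."""
    summary_points, summary_weights = [], []
    for chunk in chunks:
        centers, masses = kmeans(chunk, k)
        summary_points.extend(centers)
        summary_weights.extend(masses)
    return kmeans(summary_points, k, summary_weights)


chunks = [
    [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)],
    [(1.0, 0.0), (1.0, 1.0), (11.0, 10.0), (11.0, 11.0)],
]
final_centers, final_masses = cluster_chunks(chunks, k=2)
```

Each chunk is reduced to at most k weighted centers, so only the summaries, not the raw points, need to be kept in memory for the final clustering step.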



Additional Metadata

Item Type: Article
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.1186/s40537-015-0036-x
Publisher: Springer
Keywords: Data mining; Data stream clustering; Vector space model; Divide and conquer
Depositing User: Mohd Hafiz Che Mahasan
Date Deposited: 02 Oct 2017 07:53
Last Modified: 02 Oct 2017 07:53
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1186/s40537-015-0036-x
URI: http://psasir.upm.edu.my/id/eprint/55419