Can exclusive clustering on streaming data be achieved?

Published: 2006-12 Issue: 2 Volume: 8 Page: 102-108
ISSN: 1931-0145
Container-title: ACM SIGKDD Explorations Newsletter
Language: en
Short-container-title: SIGKDD Explor. Newsl.

Author:

Orlowska Maria E.¹, Sun Xingzhi¹, Li Xue¹

Affiliation:

1. The University of Queensland, Brisbane, Australia

Abstract

Clustering on streaming data aims to partition a list of data points into k groups of "similar" objects in a single scan of the data. Most current one-scan clustering algorithms do not keep the original data in the resulting clusters. The output of these algorithms is therefore not the clustered data points but an approximation of the data's properties under the predefined similarity function, such that k centers and radii reflect the up-to-date data grouping. In this paper, we raise a critical question: can partition-based clustering, or exclusive clustering, be achieved on streaming data by the currently available algorithms? After identifying the differences between traditional clustering and clustering on data streams, we discuss the basic requirements for the clusters that can be discovered from streaming data. We evaluate recent work that is based on a subcluster maintenance approach. Using a few straightforward examples, we illustrate that the subcluster maintenance approach may fail to achieve exclusive clustering on data streams. Based on our observations, we also present the challenges facing any heuristic method that claims to solve the clustering problem on data streams in general.
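
To make the abstract's point about one-scan output concrete, here is a minimal Python sketch of a single-pass clusterer that keeps only k centers and radii. It is not one of the algorithms evaluated in the paper. The names `ClusterSummary` and `one_scan_cluster`, the 1-D data, the seeding from the first k points, and the incremental-mean update are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterSummary:
    """Running summary of one cluster; no original points are stored."""
    count: int
    center: float
    radius: float  # largest distance from an arriving point to the center at arrival time


def one_scan_cluster(stream, k):
    """Single pass over 1-D points, keeping only k summaries (hypothetical helper)."""
    summaries = []
    for x in stream:
        if len(summaries) < k:
            # Assumption: the first k points seed the clusters.
            summaries.append(ClusterSummary(count=1, center=float(x), radius=0.0))
            continue
        # Pick the summary whose current center is closest to the point.
        nearest = min(summaries, key=lambda s: abs(x - s.center))
        nearest.radius = max(nearest.radius, abs(x - nearest.center))
        # Incremental mean update; the point itself is then discarded.
        nearest.count += 1
        nearest.center += (x - nearest.center) / nearest.count
    return summaries


if __name__ == "__main__":
    for summary in one_scan_cluster([0.0, 10.0, 1.0, 9.0, 2.0], k=2):
        print(summary)
```

The returned summaries hold counts, centers, and radii only. Nothing in them records which point went to which cluster, so the exclusive partition of the original points cannot be read back from the output.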

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/1233321.1233339

Reference: 13 articles.

1. Models and issues in data stream systems

2. Density-Based Clustering over an Evolving Data Stream with Noise

3. Better streaming algorithms for clustering problems

Cited by 4 articles.

1. DASC: data aware algorithm for scalable clustering; Knowledge and Information Systems; 2016-06-01

2. Clustering data streams using grid-based synopsis; Knowledge and Information Systems; 2013-06-14

3. Exclusive and Complete Clustering of Streams; Lecture Notes in Computer Science

4. Rek-Means: A k-Means Based Clustering Algorithm; Lecture Notes in Computer Science