OPTICS: ordering points to identify the clustering structure

Published:1999-06 Issue:2 Volume:28 Page:49-60
ISSN:0163-5808
Container-title:ACM SIGMOD Record
language:en
Short-container-title:SIGMOD Rec.

Author:

Ankerst Mihael¹, Breunig Markus M.¹, Kriegel Hans-Peter¹, Sander Jörg¹

Affiliation:

1. Institute for Computer Science, University of Munich, Oettingenstr. 67, D-80538 Munich, Germany

Abstract

Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of the well-known clustering algorithms require input parameters which are hard to determine but have a significant influence on the clustering result. Furthermore, for many real-data sets there does not even exist a global parameter setting for which the result of the clustering algorithm describes the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster-ordering contains information which is equivalent to the density-based clusterings corresponding to a broad range of parameter settings. It is a versatile basis for both automatic and interactive cluster analysis. We show how to automatically and efficiently extract not only 'traditional' clustering information (e.g. representative points, arbitrary shaped clusters), but also the intrinsic clustering structure. For medium sized data sets, the cluster-ordering can be represented graphically and for very large data sets, we introduce an appropriate visualization technique. Both are suitable for interactive exploration of the intrinsic clustering structure offering additional insights into the distribution and correlation of the data.
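The idea of a cluster-ordering that stands in for many density-based clusterings can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a brute-force O(n²) neighbour search instead of a spatial index, the names `optics`, `extract_dbscan`, `eps`, `min_pts` and `eps_prime` are chosen here for illustration, and it assumes that a point's neighbourhood includes the point itself when the core distance is computed. The sample points are made up.

```python
import heapq
import math


def optics(points, eps, min_pts):
    """Return (ordering, reachability, core distance) for a small point list.

    Illustrative brute-force version; eps is the generating distance.
    """
    n = len(points)
    reach = [math.inf] * n   # reachability distance of each point
    core = [math.inf] * n    # core distance of each point (inf if not core)
    processed = [False] * n
    ordering = []

    def dist(i, j):
        return math.dist(points[i], points[j])

    def expand(i, seeds):
        processed[i] = True
        ordering.append(i)
        # Assumption: the eps-neighbourhood contains the point itself.
        nbrs = [j for j in range(n) if dist(i, j) <= eps]
        if len(nbrs) < min_pts:
            return  # not a core object, nothing to update
        core[i] = sorted(dist(i, j) for j in nbrs)[min_pts - 1]
        for j in nbrs:
            if not processed[j]:
                new_reach = max(core[i], dist(i, j))
                if new_reach < reach[j]:
                    reach[j] = new_reach
                    heapq.heappush(seeds, (new_reach, j))

    for start in range(n):
        if processed[start]:
            continue
        seeds = []  # min-heap of (reachability, point index)
        expand(start, seeds)
        while seeds:
            _, j = heapq.heappop(seeds)
            if not processed[j]:  # skip stale heap entries
                expand(j, seeds)
    return ordering, reach, core


def extract_dbscan(ordering, reach, core, eps_prime):
    """Read a DBSCAN-like labelling for eps_prime <= eps off the ordering."""
    labels = [-1] * len(ordering)  # -1 marks noise
    cluster = -1
    for i in ordering:
        if reach[i] > eps_prime:
            if core[i] <= eps_prime:
                cluster += 1       # start a new cluster at this core object
                labels[i] = cluster
        else:
            labels[i] = cluster    # density-reachable from the current cluster
    return labels


# Two small groups and one isolated point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
ordering, reach, core = optics(points, eps=20, min_pts=3)
for eps_prime in (2, 15):
    print(eps_prime, extract_dbscan(ordering, reach, core, eps_prime))
```

The ordering is computed once with the generating distance `eps`; different values of `eps_prime` then give different flat clusterings from the same ordering without rerunning the neighbour search.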

Publisher

Association for Computing Machinery (ACM)

Subject

Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/304181.304187

References: 26 articles.

1. Automatic subspace clustering of high dimensional data for data mining applications

2. The R*-tree: an efficient and robust access method for points and rectangles

Cited by 1633 articles.

1. Integrating supervised and unsupervised learning approaches to unveil critical process inputs;Computers & Chemical Engineering;2025-01

2. Density peaks clustering based on Gaussian fuzzy neighborhood with noise parameter;Expert Systems with Applications;2024-12

3. How Gaia sheds light on the Milky Way star cluster population;New Astronomy Reviews;2024-12

4. Terraced compression method with automated threshold selection using GMM algorithm for heterogeneous bodies detection;Measurement;2024-10

5. Regional freight accessibility analysis based on truck trajectories—A case study of Hunan Province in China;Research in Transportation Business & Management;2024-10