BIRCH

Published: 1996-06 Issue: 2 Volume: 25 Page: 103-114
ISSN: 0163-5808
Container-title: ACM SIGMOD Record
Language: en
Short-container-title: SIGMOD Rec.

Author:

Zhang Tian¹, Ramakrishnan Raghu¹, Livny Miron¹

Affiliation:

1. Computer Sciences Dept., Univ. of Wisconsin-Madison

Abstract

Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters, or densely populated regions, in a multi-dimensional dataset. Prior work does not adequately address the problem of large datasets and minimization of I/O costs. This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and demonstrates that it is especially suitable for very large databases. BIRCH incrementally and dynamically clusters incoming multi-dimensional metric data points to try to produce the best quality clustering with the available resources (i.e., available memory and time constraints). BIRCH can typically find a good clustering with a single scan of the data, and improve the quality further with a few additional scans. BIRCH is also the first clustering algorithm proposed in the database area to handle "noise" (data points that are not part of the underlying pattern) effectively. We evaluate BIRCH's time/space efficiency, data input order sensitivity, and clustering quality through several experiments. We also present a performance comparison of BIRCH versus CLARANS, a clustering method proposed recently for large datasets, and show that BIRCH is consistently superior.
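
The abstract does not describe BIRCH's internals, so the following is only a minimal sketch of the single-scan, incremental idea it mentions. It assumes the additive "clustering feature" summary (point count, linear sum, sum of squared norms) used in the full paper, and it keeps the summaries in a flat list. The height-balanced tree, threshold rebuilding under memory limits, noise handling, and the later refinement scans are all omitted. The names `ClusteringFeature`, `absorb_stream`, and `threshold` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Additive summary of a point set: count, linear sum, sum of squared norms."""
    n: int
    linear_sum: List[float]
    square_sum: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        return cls(1, list(point), sum(x * x for x in point))

    def merged(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # Summaries of disjoint point sets combine by component-wise addition.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root-mean-square distance of the summarized points from the centroid.
        centroid_sq = sum(c * c for c in self.centroid())
        return sqrt(max(self.square_sum / self.n - centroid_sq, 0.0))


def absorb_stream(points, threshold: float) -> List[ClusteringFeature]:
    """One pass over the data: each point joins the nearest summary if the
    merged radius stays within `threshold`, otherwise it starts a new summary.
    Assumption: a flat list replaces the tree used by the actual method."""
    summaries: List[ClusteringFeature] = []
    for point in points:
        cf = ClusteringFeature.from_point(point)
        best_index, best_dist = None, float("inf")
        for i, s in enumerate(summaries):
            dist = sqrt(sum((a - b) ** 2 for a, b in zip(s.centroid(), point)))
            if dist < best_dist:
                best_index, best_dist = i, dist
        if best_index is not None:
            candidate = summaries[best_index].merged(cf)
            if candidate.radius() <= threshold:
                summaries[best_index] = candidate
                continue
        summaries.append(cf)
    return summaries


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
    for s in absorb_stream(sample, threshold=0.5):
        print(s.n, s.centroid())
```

Because each summary is updated by addition alone, no raw point has to be kept in memory after it is read, which is what makes a single scan feasible in this sketch. The result depends on the order of the input, which is consistent with the input order sensitivity the abstract says the authors evaluate.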

Publisher

Association for Computing Machinery (ACM)

Subject

Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/235968.233324

References: 12 articles.

Cited by 1397 articles.

1. Anchor-based fast spectral ensemble clustering;Information Fusion;2025-01

2. Integrating machine learning and biosensors in microfluidic devices: A review;Biosensors and Bioelectronics;2024-11

3. Uncovering the financial impact of energy-efficient building characteristics with eXplainable artificial intelligence;Applied Energy;2024-11

4. S3PaR: Section-based Sequential Scientific Paper Recommendation for paper writing assistance;Knowledge-Based Systems;2024-11

5. An improved density peaks clustering algorithm based on the generalized neighbors similarity;Engineering Applications of Artificial Intelligence;2024-10