Abstract
The Density Peak Clustering (DPC) algorithm is a recent density-based clustering method. Most of its execution time is spent calculating the local density and the separation distance of each data point in a dataset. The purpose of this study is to accelerate that computation. On average, the DPC algorithm scans half of the dataset to calculate the separation distance of a single data point. We propose an approach that calculates the separation distance of a data point by scanning only its neighbors. In addition, the separation distance serves only to help choose the density peaks, which are the data points with both high local density and high separation distance. We therefore propose an approach that identifies non-peak data points at an early stage, so that their separation distances need not be calculated. Our experimental results show that most of the data points in a dataset benefit from the proposed approaches, which accelerates the DPC algorithm.
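To make the two quantities concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the common cutoff-kernel definition of local density (the number of other points closer than a cutoff distance, here called `d_c`) and defines the separation distance as the distance to the nearest point of strictly higher density. The function names, `d_c`, and the sample points are all hypothetical. The neighbor-first variant illustrates one way a neighbor-only scan can be exact: if any neighbor within `d_c` is denser, the nearest denser point must itself lie within `d_c`, so the full scan is needed only when no such neighbor exists.

```python
import math

def local_density(points, d_c):
    """Cutoff-kernel density (assumed definition): count of other points closer than d_c.
    Also returns each point's neighbor list as (index, distance) pairs."""
    n = len(points)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d < d_c:
                neighbors[i].append((j, d))
                neighbors[j].append((i, d))
    rho = [len(nb) for nb in neighbors]
    return rho, neighbors

def separation_full_scan(points, rho, i):
    """Baseline: distance to the nearest strictly denser point, scanning every point.
    A point with no denser point gets its largest distance to any point."""
    denser = [math.dist(points[i], points[j])
              for j in range(len(points)) if rho[j] > rho[i]]
    if denser:
        return min(denser)
    return max(math.dist(points[i], p) for p in points)

def separation_neighbor_first(points, rho, neighbors, i):
    """Illustrative shortcut: scan only the neighbors within d_c first.
    If one of them is denser, the nearest denser point is among them."""
    near = [d for j, d in neighbors[i] if rho[j] > rho[i]]
    if near:
        return min(near)
    # No denser neighbor: fall back to the full scan (potential density peak).
    return separation_full_scan(points, rho, i)

# Hypothetical toy data: a five-point cluster and a three-point cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10)]
rho, neighbors = local_density(points, d_c=1.2)
delta = [separation_neighbor_first(points, rho, neighbors, i)
         for i in range(len(points))]
```

In this toy dataset, six of the eight points resolve their separation distance from the neighbor list alone; only the densest point of each cluster falls back to the full scan. Because such a point's separation distance is below `d_c`, it is also an unlikely peak candidate, which is the intuition behind ruling out non-peak points early.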
Funder
Ministry of Science and Technology, Taiwan
Subject
Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science (miscellaneous)
Cited by
7 articles.