Abstract
This article summarizes and evaluates how commonly used clustering algorithms perform on datasets with different density distributions. Three typical kinds of dataset were designed: circular datasets, datasets with clusters of different sizes, and Gaussian mixture datasets. K-means, Gaussian mixture clustering, DBSCAN, and agglomerative clustering were then implemented and their clustering performance on these datasets was evaluated. The results show that DBSCAN is more stable when the density distribution of the dataset is not clear. In addition, agglomerative clustering based on the shortest distance (single linkage) can identify the type of dataset. Moreover, a single clustering algorithm is not sufficient for analyzing a Gaussian mixture dataset; it is recommended to preprocess the dataset and then apply several clustering algorithms to it.
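As an illustration of the shortest-distance (single-linkage) agglomerative clustering mentioned above, the following is a minimal sketch that uses only the Python standard library. It is not the authors' implementation: the function names `make_circles` and `single_linkage`, the ring radii, and the number of points per ring are assumptions chosen for this example, and the noise-free rings are far simpler than the datasets evaluated in the article.

```python
import math


def make_circles(n_per_ring=40, radii=(1.0, 3.0)):
    """Build deterministic points on concentric circles (hypothetical toy data).

    Returns (points, ring_labels), where ring_labels[i] is the index of the
    ring that points[i] lies on.
    """
    points, labels = [], []
    for ring, radius in enumerate(radii):
        for i in range(n_per_ring):
            theta = 2 * math.pi * i / n_per_ring
            points.append((radius * math.cos(theta), radius * math.sin(theta)))
            labels.append(ring)
    return points, labels


def single_linkage(points, n_clusters):
    """Single-linkage agglomerative clustering.

    Repeatedly merges the two clusters whose closest pair of points is
    nearest, which is equivalent to processing all pairwise distances in
    ascending order with a union-find structure (Kruskal-style).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )

    remaining = n
    for _, i, j in edges:
        if remaining == n_clusters:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two clusters
            remaining -= 1

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


points, ring_labels = make_circles()
predicted = single_linkage(points, n_clusters=2)
print(predicted == ring_labels)
```

In this toy setting, neighboring points on the same ring are much closer to each other than to any point on the other ring, so merging by shortest distance joins each ring into one cluster before any cross-ring merge happens. With noisy or overlapping data the same procedure can chain unrelated groups together, so the sketch should not be read as a general recommendation.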
Subject
General Physics and Astronomy
Cited by
2 articles.