An Improved Algorithm Based on Fast Search and Find of Density Peak Clustering for High-Dimensional Data

Published: 2021-07-26 Volume: 2021 Pages: 1-12
ISSN:1530-8677
Container-title:Wireless Communications and Mobile Computing
language:en

Author:

Du Hui¹, Ni Yiyang¹, Wang Zhihe¹

Affiliation:

1. The School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China

Abstract

The fast search and find of density peak clustering algorithm (FDP) performs poorly on high-dimensional data. The problem arises because the algorithm performs no feature selection: all features are evaluated with the same weight, without distinction, so the final clustering result falls short of expectations. To address this problem, we construct a random forest to calculate an importance value for every feature of the high-dimensional data, along with the mean of these values. Features whose importance value is less than 10% of the mean are removed, and the remaining important features form a new dataset. An improved t-SNE is then applied to this dataset for dimension reduction, which yields better performance. The method thus uses t-SNE, improved with the idea of random forest, to reduce the dimension of the original data, and combines it with an improved FDP to form the new clustering method. In our experiments, the NMI of the improved algorithm is 23% higher than that of the original FDP algorithm and 9.1% higher than that of other clustering algorithms (K-means, DBSCAN, and spectral clustering). Experiments on UCI datasets and wireless sensor networks verify that the method performs well on high-dimensional datasets.
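The feature-removal rule in the abstract (drop features whose importance is below 10% of the mean importance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_important_features` and `project`, the `ratio` parameter, and the importance scores are all hypothetical, and the scores are hard-coded here rather than computed from a random forest.

```python
from statistics import mean


def select_important_features(importances, ratio=0.10):
    """Return indices of features whose importance is at least ratio * mean."""
    threshold = ratio * mean(importances)
    # Features strictly below the threshold are removed, per the abstract's rule.
    return [i for i, value in enumerate(importances) if value >= threshold]


def project(rows, keep):
    """Keep only the selected feature columns of each sample."""
    return [[row[i] for i in keep] for row in rows]


# Hypothetical importance scores for five features (assumed, not from the paper).
importances = [0.40, 0.35, 0.01, 0.23, 0.01]
keep = select_important_features(importances)

# Two toy samples with five features each (illustrative data only).
rows = [[1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0]]
reduced = project(rows, keep)
```

With these assumed scores the mean importance is 0.2, so the threshold is 0.02 and the two features scoring 0.01 are dropped. The reduced dataset would then be passed on to the dimension-reduction and clustering stages the abstract describes.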

Funder

Northwest Normal University

Publisher

Hindawi Limited

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Information Systems

Link

http://downloads.hindawi.com/journals/wcmc/2021/9977884.pdf
