
Hypergraph-Clustering Method Based on an Improved Apriori Algorithm

Published:2023-09-22 Issue:19 Volume:13 Page:10577
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Chen Rumeng¹², Hu Feng¹², Wang Feng¹², Bai Libing¹²

Affiliation:

1. School of Computer, Qinghai Normal University, Xining 810008, China

2. The State Key Laboratory of Tibetan Intelligent Information Processing and Application, School of Computer, Xining 810008, China

Abstract

As data structures and dimensions grow more complex and variable, traditional clustering algorithms face various challenges, and the integration of network science and clustering has become a popular field of exploration. One of the main challenges is how to handle large-scale, complex, high-dimensional data effectively. Hypergraphs can accurately represent multidimensional heterogeneous data, making them important for improving clustering performance. In this paper, we propose a hypergraph-clustering method, the high-dimensional data clustering method based on hypergraph partitioning using an improved Apriori algorithm (HDHPA). First, the method constructs a hypergraph using the improved Apriori association rule algorithm, where frequent itemsets in the high-dimensional data are treated as hyperedges. Then, different frequent itemsets are mined in parallel to obtain hyperedges with corresponding ranks, avoiding the generation of redundant rules and improving mining efficiency. Next, we use the dense subgraph partition (DSP) algorithm to divide the hypergraph into multiple subclusters. Finally, we merge the subclusters through dense sub-hypergraphs to obtain the clustering results. The advantage of this method lies in its use of the hypergraph model to discretize the association between data in space, which further enhances the effectiveness and accuracy of clustering. We comprehensively compare the proposed HDHPA method with several advanced hypergraph-clustering methods on seven different types of high-dimensional datasets and also compare their running times. The results show that the clustering evaluation index values of the HDHPA method are generally superior to those of the other methods. The maximum ARI value reaches 0.834, an increase of 42%, and the average running time is lower than that of the other methods. Overall, HDHPA exhibits excellent performance on multiple real networks. The research results of this paper provide an effective solution for processing and analyzing large-scale network datasets and are also conducive to broadening the application range of clustering techniques.
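
The first step described in the abstract, treating frequent itemsets as hyperedges, can be illustrated with a minimal sketch. It uses the classic level-wise Apriori procedure, not the improved parallel variant the paper proposes, and it assumes that items play the role of hypergraph vertices. The function names `frequent_itemsets` and `hyperedges` and the parameters `min_support`, `max_size`, and `min_size` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Classic level-wise Apriori: return {itemset: support count}.

    Assumption: plain Apriori, not the improved variant from the paper.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current and k <= max_size:
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count support of the surviving candidates.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result


def hyperedges(itemsets, min_size=2):
    """Treat each frequent itemset with at least min_size items as a hyperedge."""
    edges = [sorted(s) for s in itemsets if len(s) >= min_size]
    return sorted(edges, key=lambda e: (len(e), e))


if __name__ == "__main__":
    data = [
        {"a", "b", "c"},
        {"a", "b"},
        {"a", "c"},
        {"b", "c"},
        {"a", "b", "c"},
    ]
    print(hyperedges(frequent_itemsets(data, min_support=3)))
    print(hyperedges(frequent_itemsets(data, min_support=2)))
```

In this toy data every pair of items occurs together in three transactions and the triple in two, so lowering `min_support` from 3 to 2 adds one rank-3 hyperedge to the three rank-2 hyperedges. The later DSP partitioning and subcluster merging steps are not covered by this sketch.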

Funder

The National Natural Science Foundation of China

Basic Research Program of Qinghai Province

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science

Link

https://www.mdpi.com/2076-3417/13/19/10577/pdf


Cited by 1 article.

1. Application and analysis of online and offline blended teaching mode based on online and offline in art theory course civics;Applied Mathematics and Nonlinear Sciences;2024-01-01