Authors:
Yan An, Wang Wei, Ren Yi, Geng HongWei
Abstract
Abnormal and missing data are persistent problems in traditional clustering of multi-modal heterogeneous big data. To address them, this paper proposes a multi-view heterogeneous big data clustering algorithm based on improved K-means clustering. First, for big data that contain heterogeneous data, we use multi-view data analysis to propose an improved K-means algorithm, built on a multi-view heterogeneous system, that determines the similarity detection metrics. Then, a BP neural network predicts the missing attribute values, completing the missing data and restoring the structure of the big data in its heterogeneous state. Finally, we propose a data denoising algorithm to remove noise from the abnormal data. Combining these methods, we construct a framework, named BPK-means, that resolves the problems of data abnormalities and missing data. The approach is assessed in a rigorous performance evaluation study. Both theoretical verification and experimental results show that the proposed method is considerably more accurate than the original algorithm.
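The three stages named in the abstract (completing missing values, denoising abnormal data, clustering) can be illustrated with a minimal sketch. It is not the BPK-means framework itself, whose details the abstract does not give. Column-mean imputation stands in for the BP neural network predictor, a distance-based z-score filter stands in for the unspecified denoising algorithm, and plain Lloyd's K-means is used without the multi-view similarity metric. The function names `impute_missing`, `drop_outliers`, and `kmeans`, and the threshold `z_max`, are hypothetical.

```python
import math
import random


def impute_missing(rows):
    """Fill None entries with the column mean.

    Assumption: a simple stand-in for the BP-network predictor in the abstract.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def drop_outliers(rows, z_max=2.0):
    """Drop rows that lie unusually far from the global centroid.

    Assumption: a z-score on centroid distances replaces the paper's
    denoising algorithm, which the abstract does not specify.
    """
    centroid = [sum(col) / len(rows) for col in zip(*rows)]
    dists = [math.dist(r, centroid) for r in rows]
    mu = sum(dists) / len(dists)
    sigma = math.sqrt(sum((d - mu) ** 2 for d in dists) / len(dists))
    if sigma == 0:
        return list(rows)
    return [r for r, d in zip(rows, dists) if (d - mu) / sigma <= z_max]


def kmeans(rows, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means with Euclidean distance."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(r, centers[c]))
                  for r in rows]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

A pipeline in this spirit would call `impute_missing`, then `drop_outliers`, then `kmeans` on the cleaned rows. Because a column mean is itself sensitive to outliers, a learned predictor such as the BP network described above would be expected to complete missing values more faithfully than this stand-in.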
Subject
Artificial Intelligence, Biomedical Engineering
Cited by
12 articles.