CVDP k-means clustering algorithm for differential privacy based on coefficient of variation

Published:2022-09-22 Issue:5 Volume:43 Page:6027-6045
ISSN:1064-1246
Container-title:Journal of Intelligent & Fuzzy Systems
Short-container-title:IFS

Author:

Kong Yuting¹²³, Qian Yurong¹²³, Tan Fuxiang¹²³, Bai Lu¹²³, Shao Jinxin¹²³, Ma Tinghuai⁴, Tereshchenko Sergei Nikolayevich⁵

Affiliation:

1. School of Software, Xinjiang University, Urumqi, Xinjiang Uygur Autonomous Region, China

2. Key Laboratory of Signal Detection and Processing in Xinjiang Uygur Autonomous Region, Xinjiang University, Urumqi, China

3. Key Laboratory of Software Engineering, Xinjiang University, Urumqi, China

4. Nanjing University of Information Science & Technology, Nanjing, China

5. Novosibirsk State University of Economics and Management (NSUEM), Russia

Abstract

Data clustering is widely applied across industries and helps enterprises optimize their services. However, when the original data to be analyzed contains users’ personal information, the data holder’s clustering analysis may expose users’ privacy. The differential privacy k-means algorithm is a clustering method based on differential privacy that addresses privacy disclosure during data clustering. In this algorithm, Laplace noise controlled by the privacy parameter ε is added to the cluster centers to protect sensitive user information and the clustering results derived from the original data, but the added noise reduces clustering utility. To balance the availability and privacy of differentially private k-means clustering, existing improvements focus mainly on selecting the initial cluster centers or on optimizing outlier handling, and do not consider that each data dimension contributes differently to the clustering. Therefore, this paper proposes CVDP k-means, a differentially private k-means clustering algorithm based on the coefficient of variation. The CVDP scheme first eliminates outliers in the original data using data density, and then uses the coefficient of variation to design a weighted similarity measure between data points and an initial centroid selection method. Experimental results show that the CVDP k-means algorithm achieves some improvement in availability, performance, and privacy.
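The abstract does not give the paper's formulas, so the following Python sketch only illustrates the two generic ingredients it names: per-dimension weights derived from the coefficient of variation, and a centroid update perturbed with Laplace noise scaled by ε. All function names are hypothetical. The weighting rule (weights proportional to std/|mean|), the assumption that every coordinate is scaled to [0, 1], and the resulting sensitivity of d + 1 are assumptions borrowed from the standard differentially private k-means formulation, not the CVDP method itself. Density-based outlier removal and initial centroid selection are omitted.

```python
import random
import statistics


def cv_weights(points):
    """Per-dimension weights proportional to the coefficient of variation.

    Assumed rule (not taken from the paper): CV = population std / |mean|,
    normalized so the weights sum to 1.
    """
    cvs = []
    for column in zip(*points):
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)
        cvs.append(std / abs(mean) if mean != 0 else 0.0)
    total = sum(cvs)
    if total == 0:
        # No dimension varies: fall back to uniform weights.
        return [1.0 / len(cvs)] * len(cvs)
    return [cv / total for cv in cvs]


def weighted_distance(p, q, weights):
    """Euclidean distance with each squared difference scaled by its weight."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)) ** 0.5


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_centroid(cluster, epsilon, rng):
    """Centroid from a Laplace-perturbed coordinate sum and point count.

    Assumes every coordinate lies in [0, 1], so adding or removing one
    point changes the (sums, count) vector by at most d + 1 in L1 norm.
    epsilon is the budget spent on this single update.
    """
    d = len(cluster[0])
    scale = (d + 1) / epsilon
    # Clamping the noisy count is post-processing and does not weaken privacy.
    noisy_count = max(len(cluster) + laplace_noise(scale, rng), 1.0)
    noisy_sums = [sum(col) + laplace_noise(scale, rng) for col in zip(*cluster)]
    return [s / noisy_count for s in noisy_sums]
```

A smaller ε gives a larger noise scale and therefore stronger privacy but less accurate centroids, which is the utility trade-off the abstract describes. A dimension with a higher coefficient of variation receives a larger weight in `weighted_distance`, so it contributes more to the similarity between points.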

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

Reference: 27 articles.

1. Survey on Privacy-Preserving Machine Learning;Liu;Journal of Computer Research and Development,2020

2. Survey on privacy preserving techniques for machine learning;Tan;Journal of Software,2020

3. Data mining privacy preserving: Research agenda,e;Kreso;Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery,2021

4. Improving healthcare services using source anonymous scheme with privacy preserving distributed healthcare data collection and mining;Domadiya;Computing,2021

5. Privacy-preserving data mining of cross-border financial flows;Sekgoka;Cogent Engineering,2022

Cited by 6 articles.

1. Study on stability evaluation of goaf based on AHP and EWM—taking the northern new district of Liaoyuan city as an example;Scientific Reports;2024-08-02

2. Study on Stability Evaluation of Goaf Based on AHP and EWM—Taking the Northern New District of Liaoyuan City as an Example;2024-06-04

3. GAPBAS: Genetic algorithm-based privacy budget allocation strategy in differential privacy K-means clustering algorithm;Computers & Security;2024-04

4. Research on Local Fingerprint Image Differential Privacy Protection Method Based on Clustering Algorithm and Regression Algorithm Segmentation Image;IEEE Access;2024

5. An Improved Density Peak Clustering Algorithm Based on Chebyshev Inequality and Differential Privacy;Applied Sciences;2023-07-27