Affiliation:
1. Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
2. Department of Computer Engineering, Inha University, Incheon 22212, Republic of Korea
Abstract
Hierarchical clustering is a widely used data analysis technique. Typically, tools for this method operate on data in its original, readable form, raising privacy concerns when a clustering task involving sensitive data that must remain confidential is outsourced to an external server. To address this issue, we developed a method that integrates Cheon-Kim-Kim-Song homomorphic encryption (HE), allowing the clustering process to be performed without revealing the raw data. In hierarchical clustering, the two nearest clusters are repeatedly merged until the desired number of clusters is reached. The proximity of clusters is evaluated using various metrics. In this study, we considered two well-known metrics: single linkage and complete linkage. Applying HE to these methods involves sorting encrypted distances, which is a resource-intensive operation. Therefore, we propose a cooperative approach in which the data owner aids the sorting process and shares a list of data positions with a computation server. Using this list, the server can determine the clustering of the data points. The proposed approach ensures secure hierarchical clustering using single and complete linkage methods without exposing the original data.
Funder
National Research Foundation of Korea
Institute of Information and Communications Technology Planning and Evaluation
Inha University Research Grant