Affiliation:
1. Faculty of Computer Science, MSA University, Egypt
Abstract
The division and exponentiation operations required at each iteration of the possibilistic c-means (PCM) clustering algorithm complicate its implementation, especially on homomorphically encrypted data. This paper presents a novel, efficient soft clustering algorithm based on the possibilistic paradigm, termed SPCM, which aims to ease future applications of PCM to encrypted data. SPCM reduces the exponentiation and division operations required at each iteration by restricting the membership values to an ordered set of discrete values in [0,1], resulting in better performance in terms of runtime and several other performance indices. At each iteration, the distances to the new cluster centers are determined and then compared to an initially computed and dynamically updated set of values that divide the entire range of distances associated with each cluster center into intervals (bins); each object is assigned a soft membership according to the bin its distance falls in. The number of comparisons required is logarithmic in the number of discretization levels. Thus, the computation of centers and memberships is greatly simplified during execution. In addition, the use of discrete membership values allows soft modification (increment or decrement) of the memberships of identified outliers and core objects, instead of the rough modification (setting to zero or one) used in related algorithms. Experimental results on synthetic and standard test data sets verified the efficiency and effectiveness of the proposed algorithm. Compared to PCM on five datasets, the average reduction in runtime is 35%, and the average increase in V-measure, adjusted mutual information, and adjusted Rand index is 6%. The larger the dataset, the higher the reduction in runtime. SPCM also achieved comparable performance with lower computational complexity than variants of related algorithms.
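The bin-based membership assignment described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes evenly spaced membership levels, the standard PCM membership function u = 1 / (1 + (d²/η)^(1/(m−1))) as the rule for placing bin edges, and a fixed scale parameter `eta`; the function names `make_bins` and `discrete_membership` are hypothetical. The edges are computed once, so each later assignment needs only a binary search, with no division or exponentiation.

```python
from bisect import bisect_left

def make_bins(eta, n_levels, m=2.0):
    """Precompute squared-distance bin edges (hypothetical helper).

    Assumption: levels are evenly spaced in [0, 1], and each edge is the
    squared distance at which the standard PCM membership equals the
    midpoint between two adjacent levels.
    """
    # Membership levels in descending order: 1.0 first, 0.0 last.
    levels = [k / (n_levels - 1) for k in range(n_levels - 1, -1, -1)]
    edges = []
    for hi, lo in zip(levels[:-1], levels[1:]):
        mid = (hi + lo) / 2.0
        # Inverse of u = 1 / (1 + (d2 / eta) ** (1 / (m - 1))) at u = mid.
        edges.append(eta * (1.0 / mid - 1.0) ** (m - 1.0))
    return edges, levels  # edges are ascending in distance

def discrete_membership(d2, edges, levels):
    """Look up the discrete membership for squared distance d2.

    Binary search over the edges: O(log n_levels) comparisons.
    """
    return levels[bisect_left(edges, d2)]

edges, levels = make_bins(eta=1.0, n_levels=5)
memberships = [discrete_membership(d2, edges, levels)
               for d2 in (0.0, 0.5, 1.0, 5.0, 100.0)]
print(memberships)
```

With five levels, nearby objects receive membership 1.0 and distant ones 0.0, with intermediate distances mapped to 0.75, 0.5, or 0.25. How the paper initializes and dynamically updates its bin edges is not specified in the abstract, so the edge rule here is only a stand-in.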
Subject
Artificial Intelligence, General Engineering, Statistics and Probability