Affiliation:
1. Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11362, Saudi Arabia
Abstract
Semi-supervised clustering typically relies on both labeled and unlabeled data to guide the learning process towards the optimal data partition and to avoid local minima. However, efforts to improve existing semi-supervised clustering approaches remain relatively scarce compared to the contributions made to enhance state-of-the-art fully unsupervised clustering approaches. In this paper, we propose a novel semi-supervised deep clustering approach, named Soft Constrained Deep Clustering (SC-DEC), that aims to address the limitations of existing semi-supervised clustering approaches. Specifically, the proposed approach leverages a deep neural network architecture and generates fuzzy membership degrees that better reflect the true partition of the data. It uses side-information, formulated as a set of soft pairwise constraints, to supervise the learning process. This supervision information is expressed using relaxed constraints named “should-link” constraints, which indicate whether a pair of data instances should be assigned to the same cluster or to different clusters. The clustering task is formulated as an optimization problem via the minimization of a novel objective function. The proposed approach’s performance was assessed through extensive experiments on benchmark datasets and compared to relevant state-of-the-art clustering algorithms. The obtained results demonstrate the impact of using minimal prior knowledge about the data on the overall clustering performance.
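To make the idea of soft pairwise constraints on fuzzy memberships concrete, the following is a minimal sketch and not the SC-DEC objective, which the abstract does not spell out. It assumes Student's t soft assignments of the kind used in DEC-style models, and a hypothetical penalty in which a pair expected to share a cluster costs 1 minus the inner product of its membership vectors, while a pair expected to be separated costs the inner product itself. The function names, the `alpha` parameter, and the toy data are illustrative assumptions.

```python
import numpy as np

def soft_assignments(z, centers, alpha=1.0):
    """Fuzzy memberships from a Student's t kernel (assumed, DEC-style)."""
    # Squared Euclidean distance between every embedding and every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalize so each row is a membership distribution over clusters.
    return q / q.sum(axis=1, keepdims=True)

def pairwise_penalty(q, same_pairs, diff_pairs):
    """Hypothetical soft-constraint penalty on membership vectors."""
    cost = 0.0
    for i, j in same_pairs:
        cost += 1.0 - q[i] @ q[j]   # small when i and j share a cluster
    for i, j in diff_pairs:
        cost += q[i] @ q[j]         # small when i and j are separated
    return float(cost)

# Toy embeddings: points 0 and 1 are close together, point 2 is far away.
z = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
q = soft_assignments(z, centers)

consistent = pairwise_penalty(q, same_pairs=[(0, 1)], diff_pairs=[(0, 2)])
violated = pairwise_penalty(q, same_pairs=[(0, 2)], diff_pairs=[(0, 1)])
print(consistent < violated)
```

Because the penalty is a smooth function of the memberships, it could in principle be added to a clustering loss and minimized by gradient descent, so that violated constraints are discouraged rather than forbidden. That is the sense in which such constraints are soft.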
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science