Author:
Fuhnwi Gerard Shu,Agbaje Janet O.,Oshinubi Kayode,Peter Olumuyiwa James
Abstract
In data mining, and statistics, anomaly detection is the process of finding data patterns (outcomes, values, or observations) that deviate from the rest of the other observations or outcomes. Anomaly detection is heavily used in solving real-world problems in many application domains, like medicine, finance , cybersecurity, banking, networking, transportation, and military surveillance for enemy activities, but not limited to only these fields. In this paper, we present an empirical study on unsupervised anomaly detection techniques such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN), (DBSCAN++) (with uniform initialization, k-center initialization, uniform with approximate neighbor initialization, and $k$-center with approximate neighbor initialization), and $k$-means$--$ algorithms on six benchmark imbalanced data sets. Findings from our in-depth empirical study show that k-means-- is more robust than DBSCAN, and DBSCAN++, in terms of the different evaluation measures (F1-score, False alarm rate, Adjusted rand index, and Jaccard coefficient), and running time. We also observe that DBSCAN performs very well on data sets with fewer number of data points. Moreover, the results indicate that the choice of clustering algorithm can significantly impact the performance of anomaly detection and that the performance of different algorithms varies depending on the characteristics of the data. Overall, this study provides insights into the strengths and limitations of different clustering algorithms for anomaly detection and can help guide the selection of appropriate algorithms for specific applications.
Publisher
Nigerian Society of Physical Sciences
Subject
General Physics and Astronomy,General Mathematics,General Chemistry
Cited by
6 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献