Affiliation:
1. School of Mathematics and Statistics, Xidian University, Xi’an 710126, China
Abstract
Outlier detection is of great significance in data mining. Its task is to find objects that appear to have been generated by a different mechanism than most of the data. Existing algorithms are mainly divided into density-based and distance-based approaches, and both have drawbacks: the former struggle to handle low-density modes, while the latter cannot detect local outliers. Moreover, outlier detection algorithms are very sensitive to parameter settings. This paper proposes a new two-parameter outlier detection (TPOD) algorithm. The proposed method does not require the number of neighbors to be set manually, and the introduction of relative distance addresses the low-density problem and allows outliers to be detected more accurately. This is a combinatorial optimization problem. First, the number of natural neighbors is calculated iteratively, and the local density of the target object is then calculated by adaptive kernel density estimation. Second, the relative distance of the target points is computed through their natural neighbors. Finally, these two parameters are combined to obtain the outlier factor. This removes the need for users to specify the number of outliers themselves, namely, the top-n effect. Two synthetic datasets and 17 real datasets were used to test the effectiveness of the method, and a comparison with five other algorithms is provided. The AUC values and F1 scores on multiple datasets are higher than those of the other algorithms, indicating that the method finds outliers accurately and is effective.
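The three steps described in the abstract can be illustrated with a rough sketch. The abstract does not give the TPOD formulas, so everything below is an assumption made for illustration: the stopping rule of the natural-neighbor search, the Gaussian kernel with a per-point bandwidth, the ratio used as relative distance, and the way the two quantities are combined. The function names `natural_neighbor_count` and `outlier_scores` are hypothetical and do not come from the paper.

```python
import numpy as np


def natural_neighbor_count(X):
    """Grow r until the number of points with no reverse neighbor stops changing.

    Assumed stopping rule; the paper's exact criterion is not in the abstract.
    """
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    masked = dist.copy()
    np.fill_diagonal(masked, np.inf)              # a point is never its own neighbor
    order = np.argsort(masked, axis=1)[:, :-1]    # neighbors sorted by distance
    reverse_count = np.zeros(n, dtype=int)
    prev_orphans = -1
    for r in range(1, n):
        # every point "votes" for its r-th nearest neighbor
        np.add.at(reverse_count, order[:, r - 1], 1)
        orphans = int(np.sum(reverse_count == 0))
        if orphans == 0 or orphans == prev_orphans:
            return r, dist, order
        prev_orphans = orphans
    return n - 1, dist, order


def outlier_scores(X):
    """Combine a local density and a relative distance into one score (assumed form)."""
    k, dist, order = natural_neighbor_count(X)
    nbrs = order[:, :k]                                  # k nearest neighbors per point
    nbr_dist = np.take_along_axis(dist, nbrs, axis=1)    # shape (n, k)
    mean_d = nbr_dist.mean(axis=1)

    # Adaptive Gaussian kernel density: bandwidth = mean neighbor distance (assumption).
    h = np.maximum(mean_d, 1e-12)
    kernel = np.exp(-0.5 * (nbr_dist / h[:, None]) ** 2)
    density = kernel.mean(axis=1) / h ** X.shape[1]

    # Relative distance: own mean neighbor distance vs. that of the neighbors (assumption).
    relative = mean_d / np.maximum(mean_d[nbrs].mean(axis=1), 1e-12)

    # Larger relative distance and lower density both raise the score (assumption).
    return relative / np.maximum(density, 1e-12)


# Toy data: one Gaussian cluster plus a single far-away point at index 50.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)), [[10.0, 10.0]]])
scores = outlier_scores(X)
```

In this toy setup no neighbor count is passed in: `k` comes from the iterative search, which mirrors the abstract's claim that the number of neighbors need not be defined manually. The sketch only ranks points by score; the thresholding that the abstract says avoids the top-n effect is not reproduced here.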
Funder
National Natural Science Foundation of China
Natural Science Basic Research Program of Shaanxi
Subject
Geometry and Topology, Logic, Mathematical Physics, Algebra and Number Theory, Analysis
Cited by
3 articles.