Anomaly detection research using Isolation Forest in Machine Learning

Authors:

Kechedzhiev A. S.1, Tsvetkova O. L.1

Affiliation:

1. Don State Technical University

Abstract

Objective. The study assesses the applicability of the Isolation Forest method to detecting anomalies in network traffic data for which few labels are available. The main aim is to evaluate the effectiveness of Isolation Forest with limited data labeling and its potential in critical areas such as cybersecurity and financial analytics.

Method. The study comprises data preprocessing, training the model on the training set, and evaluating its performance on the test set using accuracy, the confusion matrix, and a classification report. The implementation uses the Python programming language, the scikit-learn library for the Isolation Forest model, and Pandas for data handling.

Result. Evaluating Isolation Forest on unstructured data revealed its potential for identifying anomalous patterns without the need for extensive labeling. This confirms the effectiveness of Isolation Forest in environments where access to labeled data is limited or absent.

Conclusion. The results demonstrate high anomaly detection recall despite relatively low overall accuracy, which indicates the importance of interpreting metrics in context when detecting rare events in data.
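The workflow named in the abstract (preprocessing, training on a training set, and evaluation on a test set with accuracy, a confusion matrix, and a classification report) could look roughly like the sketch below. It is not the authors' code: the data are synthetic, the column names (`duration`, `src_bytes`, `dst_bytes`), the 5% anomaly share, the 70/30 split, and the `contamination=0.05` setting are all assumptions chosen only for illustration. Labels are used solely for scoring, not for fitting the model.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for network traffic features (hypothetical columns,
# not the dataset used in the study).
normal = rng.normal(loc=0.0, scale=1.0, size=(950, 3))
attack = rng.normal(loc=6.0, scale=1.0, size=(50, 3))
df = pd.DataFrame(
    np.vstack([normal, attack]),
    columns=["duration", "src_bytes", "dst_bytes"],
)
df["label"] = np.r_[np.zeros(950, dtype=int), np.ones(50, dtype=int)]  # 1 = anomaly

# Hold out a test set; stratify so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="label"),
    df["label"],
    test_size=0.3,
    stratify=df["label"],
    random_state=0,
)

# Preprocessing: scale features using statistics from the training set only.
scaler = StandardScaler().fit(X_train)

# Train without labels; contamination is an assumed anomaly share.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
model.fit(scaler.transform(X_train))

# predict() returns +1 for inliers and -1 for outliers; map to 0/1 labels.
y_pred = (model.predict(scaler.transform(X_test)) == -1).astype(int)

# Evaluation with the three tools named in the abstract.
acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
report = classification_report(
    y_test,
    y_pred,
    labels=[0, 1],
    target_names=["normal", "anomaly"],
    zero_division=0,
)

print(f"accuracy: {acc:.3f}")
print(cm)
print(report)
```

In a sketch like this, the recall reported for the `anomaly` row is the figure to read alongside overall accuracy, since the two can diverge sharply when anomalies are rare.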

Publisher

FSB Educational Establishment of Higher Education Daghestan State Technical University

