Affiliation:
1. School of Information Management, Beijing Information Science and Technology University, Beijing 100192, China
Abstract
With the rapid development of emerging technologies such as self-media, the Internet of Things, and cloud computing, massive data applications are entering the era of real-time analysis and value realization, which makes data streams ubiquitous across industries. Detecting anomalies in such data streams is therefore both important and challenging. For example, in industries such as electricity and finance, data stream anomalies often contain information that can help avoid risks and support decision making. However, most traditional anomaly detection algorithms rely on acquiring global information about the data, which makes them hard to apply in streaming scenarios. Existing reviews of anomaly detection algorithms, both domestic and international, tend to focus on static data environments and lack a systematic summary and analysis of anomaly detection algorithms for streaming data. Unlike these reviews, this review surveys the current mainstream anomaly detection algorithms for data streams and categorizes them into three types on the basis of their fundamental principles: (1) based on offline learning; (2) based on semi-online learning; (3) based on online learning. It summarizes the current state of research on data stream anomaly detection and examines the key issues in the various algorithms. Moreover, it compares the pros and cons of the algorithms in detail. Finally, the future challenges in the field are analyzed, and future research directions are proposed.
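To illustrate the constraint the abstract describes, namely that a stream detector cannot rely on global information about the data, the following minimal sketch scores each arriving value against running statistics only. It uses a simple running z-score rule with Welford's incremental mean/variance update, chosen here purely for illustration and not taken from the review. The names `StreamingZScoreDetector` and `detect`, and the `threshold` and `warmup` parameters, are hypothetical.

```python
import math


class StreamingZScoreDetector:
    """Illustrative one-pass detector (an assumption, not a method from the review).

    A value is flagged when it lies more than `threshold` standard deviations
    from the mean of the values seen before it. Only three numbers are stored,
    so no global view of the stream is required.
    """

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score cutoff (hypothetical default)
        self.warmup = warmup        # values to observe before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Score x against past values, then fold x into the statistics."""
        is_anomaly = False
        if self.count >= self.warmup and self.count > 1:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(x - self.mean) > self.threshold * std:
                is_anomaly = True
        # Welford's incremental update. Simplification: flagged values are
        # also absorbed into the statistics and inflate the later variance.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        return is_anomaly


def detect(stream, threshold=3.0, warmup=5):
    """Return the indices of values flagged as anomalous in a single pass."""
    detector = StreamingZScoreDetector(threshold, warmup)
    return [i for i, x in enumerate(stream) if detector.update(x)]


if __name__ == "__main__":
    readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 50.0, 10.1]
    print(detect(readings))
```

Because each value is scored before it updates the statistics, the detector never needs to revisit earlier data. A practical system would likely add a forgetting mechanism such as a sliding window or exponential decay so that the statistics can follow changes in the stream.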
Funder
The National Key Research and Development Program of China
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
8 articles.
1. A Zoom-In Process Mining Based Analysis Model for Document Workflows;2024 8th International Conference on Computer, Software and Modeling (ICCSM);2024-07-04
2. Real-Time Anomaly Detection in Large-Scale Sensor Networks using Isolation Forests;2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE);2024-05-09
3. Detecting Network Anomalies Using the Rain Optimization Algorithm and Hoeffding Tree-based Autoencoder;2024 10th International Conference on Web Research (ICWR);2024-04-24
4. Federated Learning for Unsupervised Anomaly Detection in ADLs of Elderly in Single-resident Smart Homes;Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing;2024-04-08
5. Anomaly Detection in Streaming Data using Isolation Forest;2024 Seventh International Women in Data Science Conference at Prince Sultan University (WiDS PSU);2024-03-03