Affiliation:
1. School of Computing, Engineering & the Built Environment, Edinburgh Napier University, Edinburgh EH10 5DT, UK
Abstract
Artificial intelligence and machine learning have become a necessary part of modern living, along with the increased adoption of new computational devices. Machine learning and artificial intelligence can detect malware better than traditional signature-based detection. However, the development of novel malware that aims to bypass detection poses a challenge: as new malware samples appear, models may experience concept drift and their detection performance drops. Our work discusses this performance degradation of machine learning-based malware detectors over time, a phenomenon known as concept drift. To this end, we developed a Python-based framework, Rapidrift, capable of analysing concept drift at a more granular level. We also created two new malware datasets, TRITIUM and INFRENO, from different sources and threat profiles to conduct a deeper analysis of the concept drift problem. To test the effectiveness of Rapidrift, we experimentally explored various fundamental methods that could reduce the effects of concept drift.
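The performance degradation described in the abstract can be illustrated with a toy, time-ordered evaluation. The sketch below is not Rapidrift's API and uses neither TRITIUM nor INFRENO. It assumes purely synthetic one-feature data, a hypothetical threshold detector standing in for any trained model, and hypothetical helper names (`make_period`, `fit_threshold`, `accuracy`, `drift_report`). The detector is trained once on the earliest period and then scored on later periods, in which the malware distribution gradually moves towards the benign one.

```python
import random
from statistics import mean


def make_period(rng, n, shift):
    """Generate n synthetic (feature, label) pairs for one time period.

    Label 1 = malware, 0 = benign. Benign samples are centred on 0.0;
    malware samples are centred on 3.0 - shift, so a larger shift means
    malware looks more like benign software (an assumed form of drift).
    """
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = (3.0 - shift) if label else 0.0
        samples.append((rng.gauss(centre, 1.0), label))
    return samples


def fit_threshold(train):
    """Toy detector: the midpoint between the two class means."""
    benign = mean(x for x, y in train if y == 0)
    malware = mean(x for x, y in train if y == 1)
    return (benign + malware) / 2


def accuracy(threshold, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return mean(1.0 if (x > threshold) == bool(y) else 0.0 for x, y in samples)


def drift_report(periods):
    """Train on the first period only, then score every later period."""
    threshold = fit_threshold(periods[0])
    return [accuracy(threshold, period) for period in periods[1:]]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    shifts = [0.0, 0.0, 1.0, 2.0, 3.0]  # period 0 is the training period
    periods = [make_period(rng, 2000, s) for s in shifts]
    for shift, acc in zip(shifts[1:], drift_report(periods)):
        print(f"shift={shift:.1f}  accuracy={acc:.3f}")
```

Under these assumptions, accuracy stays high on the undrifted test period and falls towards chance as the shift grows. This is the kind of per-period view that a time-aware evaluation makes visible and that a single pooled test score would hide.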
Subject
Computer Networks and Communications, Human-Computer Interaction