Affiliation:
1. Cybersecurity Lab, BV TECH S.p.A., 20123 Milan, Italy
Abstract
Reports produced by popular malware analysis services show a disparity in the number of samples available for different malware families. The unequal distribution across these classes can be attributed to several factors, such as technological advances and the application domain that a computer virus seeks to infect. Recent studies have demonstrated the effectiveness of deep learning (DL) algorithms when learning multi-class classification tasks from imbalanced datasets. This can be achieved by updating the learning function so that correct predictions on the minority class are rewarded more heavily and incorrect ones are penalized more heavily. This procedure can be logically implemented by leveraging the deep reinforcement learning (DRL) paradigm through a proper formulation of the Markov decision process (MDP). This paper proposes SINNER, a DRL-based multi-class classifier that approaches the data imbalance problem at the algorithmic level by exploiting a redesigned reward function, which modifies the traditional MDP model used to learn this task. Based on the experimental results, the proposed reward formulation appears to be effective. In addition, SINNER has been compared to several DL-based models that can handle class skew without relying on data-level techniques. On three out of four datasets sourced from the existing literature, the proposed model achieved state-of-the-art classification performance.
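The general idea of rewarding or penalizing minority-class predictions more heavily can be illustrated with a minimal sketch. This is not the SINNER reward function, which the abstract does not specify; the weighting scheme (smallest class count divided by the class count) and the names `class_weights` and `reward` are assumptions chosen purely for illustration.

```python
from collections import Counter


def class_weights(labels):
    """Hypothetical weighting: the rarest class gets weight 1.0,
    more frequent classes get proportionally smaller weights."""
    counts = Counter(labels)
    smallest = min(counts.values())
    return {cls: smallest / n for cls, n in counts.items()}


def reward(true_label, predicted_label, weights):
    """Return +w for a correct prediction and -w for an incorrect one,
    where w is the (assumed) weight of the sample's true class."""
    w = weights[true_label]
    return w if predicted_label == true_label else -w


# Toy imbalanced dataset: 8 samples of family "a", 2 of family "b".
labels = ["a"] * 8 + ["b"] * 2
weights = class_weights(labels)

# Mistakes and successes on the minority family "b" carry more weight
# than those on the majority family "a".
r_minority_hit = reward("b", "b", weights)
r_minority_miss = reward("b", "a", weights)
r_majority_hit = reward("a", "a", weights)
r_majority_miss = reward("a", "b", weights)
```

In a DRL formulation, a value of this kind would be returned by the environment after the agent classifies each sample, so that the learned policy is steered toward the under-represented families without resampling the data.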
Funder
Fondo Europeo di Sviluppo Regionale Puglia