Abstract
Dynamic analysis and pattern matching techniques are widely used in industry, and they provide a straightforward method for the identification of malware samples. Yara is a pattern matching technique that can use sandbox memory dumps for the identification of malware families. However, pattern matching techniques fail silently due to minor code variations, leading to unidentified malware samples. This paper presents a two-layered Malware Variant Identification using Incremental Clustering (MVIIC) process and proposes clustering of unidentified malware samples to enable the identification of malware variants and new malware families. The novel incremental clustering algorithm is used in the identification of new malware variants from the unidentified malware samples. This research shows that clustering can provide a higher level of performance than Yara rules, and that clustering is resistant to small changes introduced by malware variants. This paper proposes a hybrid approach, using Yara scanning to eliminate known malware, followed by clustering, acting in concert, to allow the identification of new malware variants. F1 score and V-Measure clustering metrics are used to evaluate our results.
Subject
Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. Cyber Security Incident Response;Journal of Information Security and Cybercrimes Research;2024-06-02
2. Detection of Malware by Using YARA Rules;2024 21st International Multi-Conference on Systems, Signals & Devices (SSD);2024-04-22
3. YAMME: a YAra-byte-signatures Metamorphic Mutation Engine;IEEE Transactions on Information Forensics and Security;2023
4. On-device Resilient Android Malware Detection using Incremental Learning;Procedia Computer Science;2022