Malware Family Prediction with an Awareness of Label Uncertainty

Published: 2022-12-17
ISSN: 0010-4620
Container-title: The Computer Journal
Language: en

Author:

Paik Joon-Young¹, Jin Rize¹

Affiliation:

1. School of Software, Tiangong University, 399 Binshuixi Road, Xiqing District, Tianjin 300387, China

Abstract

Malware family prediction has mainly been formulated as a multiclass classification problem in which a single malware family is predicted for each sample. This approach suffers from label uncertainty, which can mislead malware analysts. To make malware prediction less susceptible to this uncertainty, this study formulates malware family prediction as predicting one or more families per sample. To this end, it proposes EnDePMal, an encoder–decoder malware family prediction model with label uncertainty awareness. EnDePMal aims to predict all malware families related to a sample while preserving their priorities. It comprises a residual neural network-based encoder and a long short-term memory-based decoder with an attention mechanism. The model uses a sequence of malware family names, rather than a single family name, as the label. When a visualized malware image is input into EnDePMal, the encoder extracts important features from the image. The decoder then generates family names, with the attention mechanism allowing it to focus on relevant features by attending to the encoder’s output. Experimental results show that EnDePMal correctly predicts 77.64% of malware family sequences with their priorities preserved. Moreover, it achieves an accuracy of 93.49% and an F1-score of 0.9282 for the highest-priority malware families, making it comparable to a typical multiclass classification model.
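
The two kinds of result reported in the abstract (whole-sequence prediction with priorities preserved, and prediction of the highest-priority family) can be illustrated with a minimal sketch. The abstract does not give the exact metric definitions, so this is only one plausible reading: a sequence counts as correct when it matches the label sequence exactly, order included, and the highest-priority family is taken to be the first element of each sequence. The function names and the placeholder family names (`family_a`, etc.) are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def exact_sequence_accuracy(
    predicted: Sequence[Sequence[str]], actual: Sequence[Sequence[str]]
) -> float:
    """Fraction of samples whose predicted family sequence equals the
    label sequence, including the order (priority) of the families.
    Assumed definition; the abstract does not spell it out."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    matches = sum(1 for p, a in zip(predicted, actual) if list(p) == list(a))
    return matches / len(actual)


def top_priority_accuracy(
    predicted: Sequence[Sequence[str]], actual: Sequence[Sequence[str]]
) -> float:
    """Fraction of samples whose first (highest-priority) predicted family
    equals the first family of the label sequence (assumed definition)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    matches = sum(
        1 for p, a in zip(predicted, actual) if p and a and p[0] == a[0]
    )
    return matches / len(actual)


# Hypothetical labels: each sample carries one or more family names,
# ordered from highest to lowest priority.
labels: List[List[str]] = [
    ["family_a", "family_b"],
    ["family_c"],
    ["family_b", "family_a"],
    ["family_a"],
]

# Hypothetical model outputs for the same four samples.
predictions: List[List[str]] = [
    ["family_a", "family_b"],  # exact match
    ["family_c"],              # exact match
    ["family_a", "family_b"],  # right families, wrong priority order
    ["family_a", "family_c"],  # top family right, extra family predicted
]

sequence_acc = exact_sequence_accuracy(predictions, labels)
top_acc = top_priority_accuracy(predictions, labels)
```

Under this reading, a prediction that lists the right families in the wrong order fails the sequence metric, and it also fails the top-priority metric whenever the first family differs. The sequence metric is therefore the stricter of the two, which is consistent with the sequence figure in the abstract being lower than the top-priority accuracy.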

Funder

National Natural Science Foundation of China

Publisher

Oxford University Press (OUP)

Subject

General Computer Science

Link

https://academic.oup.com/comjnl/advance-article-pdf/doi/10.1093/comjnl/bxac181/48077092/bxac181.pdf

Cited by 1 article.

1. Multi-labeling of Malware Samples Using Behavior Reports and Fuzzy Hashing;Communications in Computer and Information Science;2023