Word Embeddings for Fake Malware Generation

Published:2022 Page:22-37
ISSN:1865-0929
Container-title:Silicon Valley Cybersecurity Conference

Author:

Tran Quang Duy, Di Troia Fabio

Abstract

Signature-based and anomaly-based techniques are the fundamental methods for detecting malware. However, in recent years malware has become more complex and sophisticated, making these techniques less effective. For this reason, researchers have turned to state-of-the-art machine learning techniques to combat threats to information security. Nevertheless, despite the integration of machine learning models, a shortage of training data still prevents these models from performing at their peak. In the past, generative models have been found to be highly effective at generating image-like data that closely follow the actual data distribution. In this paper, we apply generative modeling to opcode sequences and aim to generate malware samples by taking advantage of the contextualized embeddings from BERT. We obtained promising results when differentiating between real and generated samples. We observe that the generated malware is so similar to actual malware that classifiers have difficulty distinguishing between the two: they falsely identify the generated malware as actual malware almost 90% of the time.

Publisher

Springer Nature Switzerland

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-24049-2_2

Cited by 1 article.

1. Enhancing Malware Detection Using “Genetic Markers” and Machine Learning;2023 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech);2023-11-14