Evaluation of Malware Classification Models for Heterogeneous Data

Published: 2024-01-03
Volume: 24
Issue: 1
Page: 288
ISSN: 1424-8220
Container-title: Sensors
Language: en
Short-container-title: Sensors

Author:

Bae Ho¹

Affiliation:

1. Department of Cyber Security, Ewha Womans University, Seoul 03760, Republic of Korea

Abstract

Machine learning (ML) is widely applied across many domains, including security, where numerous studies have demonstrated its potential and effectiveness. In particular, ML methods for identifying malicious software have been developed across various security domains over the years. However, recent research has highlighted that ML models are susceptible to adversarial examples: small input perturbations that can significantly alter model predictions. Prior studies on adversarial examples focused primarily on ML models for image processing, but the work has progressively extended to other applications, including security, where adversarial attacks have proven particularly effective against malware classification. This study aims to explore the transparency of malware classification and to develop an explanation method for malware classifiers. The problem is more complex than explainable AI for homogeneous data because malware has a more intricate data structure than traditional image datasets. The study found that existing explanation methods fall short in interpreting heterogeneous data. The methods employed show that current malware detectors, despite high classification accuracy, may provide a misleading sense of security, and that measuring classification accuracy alone is insufficient for validating detectors.

Funder

Institute of Information & Communications Technology Planning & Evaluation

Artificial Intelligence Convergence Innovation Human Resources Development

National Statistics Data While Guaranteeing the Utility of Statistical Analysis

Ewha Womans University

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/24/1/288/pdf

References (57 articles)

1. Saxe, J., and Berlin, K. (2015, January 20–22). Deep neural network based malware detection using two dimensional binary program features. Proceedings of the 2015 10th International Conference on Malicious and Unwanted Software (MALWARE), Fajardo, PR, USA.

2. Yuan, Z., Lu, Y., Wang, Z., and Xue, Y. (2014, January 17–22). Droid-sec: Deep learning in android malware detection. Proceedings of the 2014 ACM Conference on SIGCOMM, Chicago, IL, USA.

3. Bayer, U., Comparetti, P.M., Hlauschek, C., Kruegel, C., and Kirda, E. (2009, January 8–11). Scalable, behavior-based malware clustering. Proceedings of the 16th Annual Network & Distributed System Security Symposium (NDSS 2009), San Diego, CA, USA.

4. Jang, J., Brumley, D., and Venkataraman, S. (2011, January 17–21). Bitshred: Feature hashing malware for scalable triage and semantic analysis. Proceedings of the 18th ACM Conference on Computer and Communications Security, Chicago, IL, USA.

5. Cova, M., Kruegel, C., and Vigna, G. (2010, January 26–30). Detection and analysis of drive-by-download attacks and malicious JavaScript code. Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA.