Evaluation of Malware Classification Models for Heterogeneous Data
Affiliation:
1. Department of Cyber Security, Ewha Womans University, Seoul 03760, Republic of Korea
Abstract
Machine learning (ML) is widely applied across many domains, including security, where numerous studies have demonstrated its potential and effectiveness. Over the years, ML methods for identifying malicious software have been developed across various security domains. However, recent research has shown that ML models are susceptible to adversarial examples: small input perturbations that can significantly alter model predictions. Although early studies of adversarial examples focused primarily on ML models for image processing, this work has progressively extended to other applications, including security. Notably, adversarial attacks have proven particularly effective against malware classification. This study aims to explore the transparency of malware classification and to develop an explanation method for malware classifiers. The problem is more complex than explainable AI for homogeneous data because malware has a more intricate data structure than traditional image datasets. We find that existing explanation methods fall short in interpreting heterogeneous data. Our methods show that current malware detectors, despite high classification accuracy, may provide a misleading sense of security, and that measuring classification accuracy alone is insufficient for validating detectors.
Funder
Institute of Information & Communications Technology Planning & Evaluation; Artificial Intelligence Convergence Innovation Human Resources Development; National Statistics Data While Guaranteeing the Utility of Statistical Analysis; Ewha Womans University
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry