Malware detection framework based on graph variational autoencoder extracted embeddings from API-call graphs

Published: 2022-05-18 Volume: 8 Page: e988
ISSN: 2376-5992
Container-title: PeerJ Computer Science
Language: en

Author:

Gunduz Hakan¹

Affiliation:

1. Software Engineering Department, Kocaeli University, Kocaeli, Marmara, Turkey

Abstract

Malware compromises the confidentiality and integrity of information, causing material and moral damage to institutions and individuals. This study proposed a malware detection model based on API-call graphs and used a Graph Variational Autoencoder (GVAE) to reduce the size of graph node features extracted from Android APK files. The GVAE-reduced embeddings were fed to a linear model (SVM) and an ensemble model (LightGBM) to complete the malware detection process. To validate the effectiveness of the GVAE-reduced features, recursive feature elimination (RFE) and Fisher score (FS) were applied to select informative feature sets of the same sizes as the GVAE-reduced embeddings. Among the RFE and FS selections, LightGBM with 50 RFE-selected features achieved the highest accuracy (0.907) and F-measure (0.852). Using GVAE-reduced embeddings for classification increased the accuracy of both models by approximately 4%. The same increase occurred in the F-measure, indicating an improvement in the models' discriminative power. The final experiment, which combined RFE selection with GVAE, improved performance over GVAE-reduced embeddings alone. With 30 relevant features selected by RFE from the combination of all GVAE embeddings, LightGBM achieved an accuracy of 0.967.
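
The abstract names the Fisher score as one of the baseline feature selectors but does not give its formula. The sketch below assumes the commonly used definition (between-class scatter of a feature divided by its within-class scatter) and may differ from the paper's exact variant. The function names fisher_scores and top_k_features, and the toy data in the usage note, are illustrative and do not come from the paper.

```python
from statistics import fmean


def fisher_scores(X, y):
    """Return one Fisher score per feature column.

    X is a list of rows (each row a list of floats), y the class labels.
    Assumed definition: sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            class_mean = fmean(values)
            class_var = fmean([(v - class_mean) ** 2 for v in values])
            between += len(values) * (class_mean - overall_mean) ** 2
            within += len(values) * class_var
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Return the indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

For example, with rows [[0.0, 5.0], [1.0, 4.0], [10.0, 5.0], [11.0, 4.0]] and labels [0, 0, 1, 1], the first feature separates the two classes and the second does not, so top_k_features(X, y, 1) selects index 0. In the study's setting, k would be set to the size of the GVAE-reduced embedding so that the two feature sets are comparable.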

Publisher

PeerJ

Subject

General Computer Science

Link

https://peerj.com/articles/cs-988.pdf

References: 45 articles.

1. Feature selection using a machine learning to classify a malware;Al-Kasassbeh,2020

2. An efficient android malware prediction using Ensemble machine learning algorithms;Al Sarah;Procedia Computer Science,2021

3. DL-Droid: deep learning based android malware detection using real devices;Alzaylaee;Computers & Security,2020

4. Android malware detection through generative adversarial networks;Amin;Transactions on Emerging Telecommunications Technologies,2019

5. Variational autoencoder based anomaly detection using reconstruction probability;An;Special Lecture on IE,2015

Cited by 3 articles.

1. Comparative analysis of BERT and FastText representations on crowdfunding campaign success prediction;PeerJ Computer Science;2024-09-11

2. A Survey on Malware Detection with Graph Representation Learning;ACM Computing Surveys;2024-06-29

3. A Kullback-Liebler divergence-based representation algorithm for malware detection;PeerJ Computer Science;2023-09-22