Automated Model Learning for Accurate Detection of Malicious Digital Documents

Published: 2020-09-30 Issue: 3 Volume: 1 Page: 1-21
ISSN: 2692-1626
Container-title: Digital Threats: Research and Practice
Language: en

Author:

Scofield Daniel¹, Miles Craig¹, Kuhn Stephen²

Affiliation:

1. Assured Information Security, Suite, Portland, OR

2. Air Force Research Laboratory, Rome, NY

Abstract

Modern cyber attacks are often conducted by distributing digital documents that contain malware. The approach detailed herein, which consists of a classifier that uses features derived from dynamic analysis of a document viewer as it renders the document in question, is capable of classifying the disposition of digital documents with greater than 98% accuracy even when its model is trained on just small amounts of data. To keep the classification model itself small and thereby to provide scalability, we employ an entity resolution strategy that merges syntactically disparate features that are thought to be semantically equivalent but vary due to programmatic randomness. Entity resolution enables construction of a comprehensive model of benign functionality using relatively few training documents, and the model does not improve significantly with additional training data. In particular, we describe and quantitatively evaluate a fully automated, document-format-agnostic approach for learning a classification model that provides efficacious malicious document detection.
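
The entity resolution idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature strings, the `similar`, `resolve`, and `unmatched_fraction` helpers, the use of Python's `difflib.SequenceMatcher` as the similarity measure, and the 0.8 threshold are all assumptions chosen for illustration. The sketch merges feature strings that differ only in a random-looking component into one model entry, then scores a new document by the fraction of its features that match nothing in the benign model.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical similarity test: two feature strings count as the same
    entity when their SequenceMatcher ratio meets the threshold."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def resolve(features, model, threshold=0.8):
    """Add each feature to the model unless a similar entry already exists."""
    for feat in features:
        if not any(similar(feat, known, threshold) for known in model):
            model.append(feat)
    return model


def unmatched_fraction(features, model, threshold=0.8):
    """Fraction of a document's features with no similar entry in the model."""
    if not features:
        return 0.0
    misses = sum(
        1 for feat in features
        if not any(similar(feat, known, threshold) for known in model)
    )
    return misses / len(features)


# Illustrative (made-up) features observed while a viewer renders benign
# documents; the temp-file suffix varies from run to run.
benign_runs = [
    ["open /tmp/render_1A2B.tmp", "read /usr/share/fonts/arial.ttf"],
    ["open /tmp/render_9F3C.tmp", "read /usr/share/fonts/arial.ttf"],
]

model = []
for run in benign_runs:
    resolve(run, model)

# A new benign-looking document and one with an unfamiliar behavior.
benign_doc = ["open /tmp/render_0C5D.tmp", "read /usr/share/fonts/arial.ttf"]
suspicious_doc = ["open /tmp/render_7D4E.tmp", "spawn cmd.exe /c payload"]

benign_score = unmatched_fraction(benign_doc, model)
suspicious_score = unmatched_fraction(suspicious_doc, model)
```

In this toy setup the model holds two entries after both benign runs, because the two temp-file features are merged into one. A real system would need a similarity measure and decision threshold validated against data; those choices are outside what the abstract specifies.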

Funder

Air Force Research Laboratory

Publisher

Association for Computing Machinery (ACM)

Subject

General Medicine

Link

https://dl.acm.org/doi/pdf/10.1145/3379505

Reference: 25 articles.

1. The S2E platform: Design, implementation, and applications; Chipounov Vitaly; ACM Trans. Comput. Syst., 2012

2. Clustering by Compression

Cited by 1 article.

1. NFDD: A Dynamic Malicious Document Detection Method Without Manual Feature Dictionary; Wireless Algorithms, Systems, and Applications; 2021