Detecting Word-Based Algorithmically Generated Domains Using Semantic Analysis

Author:

Yang Luhui, Zhai Jiangtao, Liu Weiwei, Ji Xiaopeng, Bai Huiwen, Liu Guangjie, Dai Yuewei

Abstract

In highly sophisticated network attacks, command-and-control (C&C) servers often use domain generation algorithms (DGAs) to dynamically produce candidate domains instead of relying on static, hard-coded lists of IP addresses or domain names. Distinguishing DGA-generated domains from legitimate ones is critical for detecting the presence of malware and for locating the hidden attackers. The word-based DGAs disclosed in recent network attacks are significantly stealthier than traditional character-based DGAs. In word-based DGAs, two or more words are randomly chosen from one or more specific dictionaries to form a dynamic domain; the resulting domains aim to mimic the characteristics of legitimate domains. Existing DGA detection schemes, including the state-of-the-art one based on deep learning, still cannot identify these domains accurately while maintaining an acceptable false alarm rate. In this study, we exploit inter-word and inter-domain correlations using semantic analysis, taking word embeddings and part-of-speech into consideration. We then propose a detection framework for word-based DGAs that incorporates the frequency distribution of words and that of part-of-speech into the design of the feature set. Using an ensemble classifier constructed from Naive Bayes, Extra-Trees, and Logistic Regression, we benchmark the proposed scheme with malicious and legitimate domain samples extracted from public datasets. The experimental results show that the proposed scheme achieves significantly higher detection accuracy for word-based DGAs than three state-of-the-art DGA detection schemes.
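As a rough illustration of the ideas in the abstract, the following Python sketch shows how a word-based DGA might concatenate dictionary words and how a word frequency distribution could be computed over a batch of domains. It is not the authors' implementation: the word list `WORDS`, the helper names `generate_domain`, `segment`, and `word_frequencies`, and the dictionary-based segmentation step are all assumptions made for illustration only.

```python
import random
from collections import Counter

# Hypothetical toy dictionary; a real word-based DGA ships its own word lists.
WORDS = ["river", "stone", "market", "signal",
         "garden", "window", "paper", "silver"]


def generate_domain(rng, words=WORDS, n_words=2, tld="com"):
    """Concatenate n_words randomly chosen dictionary words into a domain."""
    return "".join(rng.choice(words) for _ in range(n_words)) + "." + tld


def segment(label, vocabulary):
    """Split a domain label into vocabulary words by dynamic programming.

    Returns the segmentation with the fewest words, or None if the label
    cannot be covered completely by vocabulary words.
    """
    best = [None] * (len(label) + 1)
    best[0] = []
    for end in range(1, len(label) + 1):
        for start in range(end):
            piece = label[start:end]
            if best[start] is not None and piece in vocabulary:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(label)]


def word_frequencies(domains, vocabulary):
    """Relative frequency of each vocabulary word across a batch of domains."""
    counts = Counter()
    for domain in domains:
        label = domain.split(".")[0]
        counts.update(segment(label, vocabulary) or [])
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
domains = [generate_domain(rng) for _ in range(200)]
freqs = word_frequencies(domains, set(WORDS))
```

Because every generated domain draws from the same small dictionary, each word recurs across many domains. A frequency-based feature can pick up on this kind of inter-domain regularity, whereas a batch of unrelated legitimate domains would typically spread over a much larger vocabulary.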

Funder

National Natural Science Foundation of China

Natural Science Foundation of Jiangsu Province

Publisher

MDPI AG

Subject

Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science (miscellaneous)

Cited by 16 articles.

1. A review on lexical based malicious domain name detection methods;Annals of Telecommunications;2024-06-13

2. A Novel Model Based on Ensemble Learning for Detecting DGA Botnets;2022 14th International Conference on Knowledge and Systems Engineering (KSE);2022-10-19

3. Malicious Domain Names Detection Algorithm Based on Statistical Features of URLs;2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD);2022-05-04

4. A semantic element representation model for malicious domain name detection;Journal of Information Security and Applications;2022-05

5. Optimal Covert Communication Techniques;International Journal of Informatics and Applied Mathematics;2022-04-11
