PUMD: a PU learning-based malicious domain detection framework

Published:2022-10-01 Issue:1 Volume:5
ISSN:2523-3246
Container-title:Cybersecurity
language:en
Short-container-title:Cybersecurity

Author:

Fan Zhaoshan,Wang Qing,Jiao Haoran,Liu Junrong,Cui Zelin,Liu Song,Liu Yuling

Abstract

The Domain Name System (DNS), one of the most critical pieces of internet infrastructure, has been abused in various cyber attacks. Current malicious domain detection capabilities are limited by insufficient credible label information, severe class imbalance, and the non-compact distribution of domain samples across different malicious activities. This paper proposes a malicious domain detection framework named PUMD, which introduces a Positive and Unlabeled (PU) learning solution to address the shortage of label information, adopts customized sample weights to mitigate the impact of class imbalance, and constructs evidence features based on resource overlapping to reduce the intra-class distance of malicious samples. In addition, a feature selection strategy based on permutation importance and binning is proposed to select the most informative detection features. Finally, we conduct experiments on the open-source real DNS traffic dataset provided by QI-ANXIN Technology Group to evaluate the PUMD framework's ability to capture potential command and control (C&C) domains for malicious activities. The experimental results show that PUMD achieves the best detection performance under different label frequencies and class imbalance ratios.

Funder

National Key Research and Development Program of China

Youth Innovation Promotion Association CAS

Strategic Priority Research Program of the Chinese Academy of Sciences

National Natural Science Foundation of China

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Computer Networks and Communications,Information Systems,Software

Link

https://link.springer.com/content/pdf/10.1186/s42400-022-00124-x.pdf

Reference: 46 articles.

1. ALEXA-INTERNET: Alexa topsites (2021). https://www.alexa.com/topsites. Accessed 20 Aug 2021

2. Almashhadani AO, Kaiiali M, Carlin D, Sezer S (2020) Maldomdetector: a system for detecting algorithmically generated domain names with machine learning. Comput Secur 93:101787

3. Andre Correa: Malware Patrol (2021). https://www.malwarepatrol.net/. Accessed 20 Aug 2021

4. Antonakakis M, Perdisci R, Dagon D, Lee W, Feamster N (2010) Building a dynamic reputation system for DNS. In: USENIX security symposium, pp 273–290

5. Bekker J, Davis J (2020) Learning from positive and unlabeled data: a survey. Mach Learn 109(4):719–760