DQfD-AIPT: An Intelligent Penetration Testing Framework Incorporating Expert Demonstration Data

Published: 2023-05-04 Volume: 2023 Page: 1-15
ISSN:1939-0122
Container-title:Security and Communication Networks
language:en
Short-container-title:Security and Communication Networks

Author:

Wang Yongjie¹², Li Yang¹², Xiong Xinli¹², Zhang Jingye¹², Yao Qian¹², Shen Chuanxin¹²

Affiliation:

1. College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China

2. Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China

Abstract

Applying reinforcement learning (RL) to penetration testing (PT) offers a way to reduce the high labour costs and heavy reliance on expert knowledge of manual PT. To improve the efficiency of RL algorithms for PT, existing research has brought in the knowledge of PT experts and used imitation learning to guide the agent's decision-making. However, imitation learning has a clear disadvantage: the strategies learned by the agent rarely exceed the expert's demonstrated behaviour, and the agent can overfit to the expert knowledge. In addition, the expert knowledge in currently proposed methods is poorly interpretable, highly scenario-dependent, and not universal. To address these issues, we propose an intelligent PT framework named DQfD-AIPT. The framework covers the process of collecting and using expert knowledge and provides a rational definition of the structure of expert knowledge. To solve the overfitting problem, we perform PT path planning based on the deep Q-learning from demonstrations (DQfD) algorithm. DQfD combines the benefits of RL and imitation learning to effectively improve the PT strategy and performance of agents while avoiding overfitting. Finally, we conducted experiments in a simulated network scenario containing honeypots. The experimental results demonstrated the effectiveness of incorporating expert knowledge. In addition, the DQfD algorithm improves the efficiency of penetration testing more than the classical deep reinforcement learning (DRL) method and obtains a higher cumulative reward. Moreover, owing to the incorporation of expert knowledge, in scenarios with honeypots the DQfD method effectively reduces the probability of interacting with honeypots compared with the classical DRL method.
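The abstract says that DQfD combines RL with imitation learning. As a rough illustration of that idea, the sketch below adds a large-margin supervised term, applied only to expert demonstration transitions, to an ordinary one-step Q-learning loss. It is a simplified sketch of the general DQfD loss, not the paper's implementation: the function names, the margin of 0.8, the weight `lambda_e`, and the toy Q-values are all assumptions made for illustration, and the n-step return, L2 regularisation, and target network of the full algorithm are left out.

```python
def large_margin_loss(q_values, expert_action, margin=0.8):
    """Supervised term: max_a [Q(s, a) + l(a_E, a)] - Q(s, a_E).

    l(a_E, a) is 0 for the expert action and `margin` otherwise, so the
    loss is zero only when the expert action beats every other action
    by at least `margin`.
    """
    augmented = [
        q + (0.0 if a == expert_action else margin)
        for a, q in enumerate(q_values)
    ]
    return max(augmented) - q_values[expert_action]


def one_step_td_loss(q_sa, reward, next_q_values, done, gamma=0.99):
    """Squared one-step TD error (no separate target network here)."""
    target = reward if done else reward + gamma * max(next_q_values)
    return (target - q_sa) ** 2


def dqfd_loss(q_values, action, reward, next_q_values, done,
              is_demo, gamma=0.99, lambda_e=1.0, margin=0.8):
    """TD loss plus the margin term, the latter only for expert data."""
    td = one_step_td_loss(q_values[action], reward, next_q_values, done, gamma)
    supervised = large_margin_loss(q_values, action, margin) if is_demo else 0.0
    return td + lambda_e * supervised


if __name__ == "__main__":
    # Hypothetical Q-values for three PT actions in one state.
    q = [1.0, 2.0, 0.5]
    # Same transition, once as expert demonstration and once as agent data.
    print(dqfd_loss(q, 0, 1.0, [0.0, 2.0], False, is_demo=True, gamma=0.5))
    print(dqfd_loss(q, 0, 1.0, [0.0, 2.0], False, is_demo=False, gamma=0.5))
```

Because the margin term vanishes for transitions the agent collects itself, the agent is pushed towards the expert's actions only on demonstrated states and can still improve on them elsewhere through the TD term.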

Publisher

Hindawi Limited

Subject

Computer Networks and Communications,Information Systems

Link

http://downloads.hindawi.com/journals/scn/2023/5834434.pdf

Cited by 1 article.

1. Automated Penetration Testing Based on Adversarial Inverse Reinforcement Learning;2024 International Russian Smart Industry Conference (SmartIndustryCon);2024-03-25