Penetration Strategy for High-Speed Unmanned Aerial Vehicles: A Memory-Based Deep Reinforcement Learning Approach

Author:

Zhang Xiaojie 1,2,3; Guo Hang 2,3,4; Yan Tian 1,2,3; Wang Xiaoming 5; Sun Wendi 6; Fu Wenxing 1,2,3; Yan Jie 2,3,4

Affiliation:

1. Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an 710072, China

2. National Key Laboratory of Unmanned Aerial Vehicle Technology, Northwestern Polytechnical University, Xi’an 710072, China

3. Integrated Research and Development Platform of Unmanned Aerial Vehicle Technology, Xi’an 710072, China

4. Research Center for Unmanned System Strategy Development, Northwestern Polytechnical University, Xi’an 710072, China

5. Beijing Institute of Tracking and Telecommunication Technology, Beijing 100094, China

6. Science and Technology on Complex System Control and Intelligent Agent Cooperative Laboratory, Beijing 100074, China

Abstract

As interception measures continue to develop and strengthen, the traditional penetration methods of high-speed unmanned aerial vehicles (UAVs) can no longer meet the penetration requirements of diversified and complex combat scenarios. With the recent advances in artificial intelligence, intelligent penetration methods have become promising solutions. In this paper, a penetration strategy for high-speed UAVs based on improved Deep Reinforcement Learning (DRL) is proposed, in which Long Short-Term Memory (LSTM) networks are incorporated into the classical Soft Actor–Critic (SAC) algorithm. A three-dimensional (3D) planar engagement scenario of a high-speed UAV facing two interceptors with strong maneuverability is constructed. For the proposed LSTM-SAC approach, the reward function is designed around the criteria for successful penetration and accounts for energy and flight range constraints. An intelligent penetration strategy is then obtained through extensive training; it uses the motion states of both sides to make decisions and generate the penetration overload commands for the high-speed UAV. The simulation results show that, compared with the classical SAC algorithm, the proposed algorithm improves training efficiency by reducing the number of training episodes by 75.56%. Meanwhile, the LSTM-SAC approach achieves a successful penetration rate of more than 90% in hypothetical complex scenarios, an average increase of 40% over conventional programmed penetration methods.
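The abstract states only that the reward is built from the success criteria together with energy and flight range constraints; it does not give the functional form. The following is a minimal sketch of how such a reward could be structured, not the authors' formulation. The `StepInfo` fields, the `penetration_reward` function, the kill radius, and all weights are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StepInfo:
    """Hypothetical per-step summary of the engagement (not from the paper)."""
    miss_distances: Tuple[float, float]  # current distance to each interceptor, m
    overload_cmd: float                  # commanded overload, g
    range_loss: float                    # flight range lost to maneuvering this step, m
    terminal: bool                       # True on the last step of the episode


def penetration_reward(info: StepInfo,
                       kill_radius: float = 10.0,      # assumed value
                       w_energy: float = 0.01,         # assumed weight
                       w_range: float = 0.001,         # assumed weight
                       terminal_bonus: float = 100.0   # assumed value
                       ) -> float:
    """Shaped reward: small per-step penalties plus a terminal success/failure term."""
    # Energy constraint: penalize control effort through the squared overload command.
    # Range constraint: penalize flight range lost to the evasive maneuver.
    reward = -w_energy * info.overload_cmd ** 2 - w_range * info.range_loss

    if info.terminal:
        # Assumed success criterion: both interceptors miss by more than kill_radius.
        if min(info.miss_distances) > kill_radius:
            reward += terminal_bonus
        else:
            reward -= terminal_bonus
    return reward


if __name__ == "__main__":
    evaded = StepInfo((50.0, 80.0), overload_cmd=2.0, range_loss=100.0, terminal=True)
    print(penetration_reward(evaded))
```

In a sketch like this, the terminal term dominates so that the agent is driven mainly by whether penetration succeeds, while the per-step penalties discourage unnecessarily aggressive maneuvers that waste energy or shorten the flight range. The relative weights would need to be tuned for any real scenario.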

Funder

National Natural Science Foundation of China

Publisher

MDPI AG
