Predictive air combat decision model with segmented reward allocation

Published:2024-07-22
ISSN:2199-4536
Container-title:Complex & Intelligent Systems
language:en
Short-container-title:Complex Intell. Syst.

Author:

Li Yundi, Yuan Yinlong, Cheng Yun, Hua Liang

Abstract

In air combat missions, unmanned combat aerial vehicles (UCAVs) must take strategic actions to establish combat advantages, enabling effective tracking and attacking of enemy UCAVs. Many reinforcement learning algorithms have been applied to UCAV air combat missions. However, most of them can only select policies based on the current state of both sides, so they cannot track and attack effectively when the enemy performs large-angle maneuvers. These algorithms also cannot adapt to different situations, which leaves the UCAV at a disadvantage in some cases. To solve these problems, this paper proposes a predictive air combat decision model with segmented reward allocation for air combat tracking and attacking. On the basis of the air combat environment, we propose the prediction soft actor-critic (Pre-SAC) algorithm, which combines the prediction of enemy states with the states of the UCAV for model training. This enables the UCAV to predict the next move of the enemy UCAV in advance and establish a greater air combat advantage. Furthermore, by adopting a segmented reward allocation model and combining it with the Pre-SAC algorithm, we propose the segmented reward allocation soft actor-critic (Sra-SAC) algorithm, which solves the problem of UCAVs being unable to adapt to different situations. The results show that the prediction-based Sra-SAC algorithm with segmented reward allocation outperforms the traditional soft actor-critic (SAC) algorithm in terms of overall reward, travel distance, and relative advantage.
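The abstract names two ideas without giving their details: feeding a predicted enemy state to the policy alongside the current states, and switching the reward composition by situation. The sketch below illustrates both in a 2D toy setting. Everything in it is an assumption made for illustration and is not taken from the paper: the `State` fields, the constant-velocity predictor `predict_enemy`, the distance threshold `far_threshold`, and the 0.8/0.2 weights are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical 2D kinematic state: position and velocity."""
    x: float
    y: float
    vx: float
    vy: float


def predict_enemy(enemy: State, dt: float) -> State:
    """Assumed predictor: constant-velocity extrapolation over dt seconds."""
    return State(enemy.x + enemy.vx * dt, enemy.y + enemy.vy * dt,
                 enemy.vx, enemy.vy)


def augmented_observation(own: State, enemy: State, dt: float) -> list:
    """Concatenate own state, current enemy state, and predicted enemy state."""
    pred = predict_enemy(enemy, dt)
    return [own.x, own.y, own.vx, own.vy,
            enemy.x, enemy.y, enemy.vx, enemy.vy,
            pred.x, pred.y, pred.vx, pred.vy]


def segmented_reward(own: State, enemy: State,
                     far_threshold: float = 1000.0) -> float:
    """Assumed two-segment reward: weight distance when far, heading when near."""
    dx, dy = enemy.x - own.x, enemy.y - own.y
    dist = math.hypot(dx, dy)
    speed = math.hypot(own.vx, own.vy)
    # Cosine of the angle between own velocity and the line of sight.
    if dist == 0.0 or speed == 0.0:
        alignment = 0.0
    else:
        alignment = (own.vx * dx + own.vy * dy) / (speed * dist)
    distance_term = -dist / far_threshold
    if dist > far_threshold:
        # Far segment: closing the distance dominates.
        return 0.8 * distance_term + 0.2 * alignment
    # Near segment: pointing at the enemy dominates.
    return 0.2 * distance_term + 0.8 * alignment
```

In a setup like this, the list returned by `augmented_observation` would replace the plain observation passed to a SAC actor and critic, and `segmented_reward` would replace a single fixed reward function. The paper's actual predictor, segment boundaries, and reward terms may differ.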

Funder

the Graduate Research and Practice Innovation Program Project of Jiangsu Province

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s40747-024-01556-3.pdf

References: 35 articles.

1. Jordan J (2021) The future of unmanned combat aerial vehicles: an analysis using the three horizons framework. Futures 134:102848. https://doi.org/10.1016/j.futures.2021.102848

2. Song XN, Wu CL, Stojanovic V, Zhang W, Song S (2023) 1 bit encoding-decoding-based event-triggered fixed-time adaptive control for unmanned surface vehicle with guaranteed tracking performance. Control Eng Pract 135:105513. https://doi.org/10.1016/j.conengprac.2023.105513

3. Song XN, Wu CL, Song S, Stojanovic V, Tejado I (2024) Fuzzy wavelet neural adaptive finite-time self-triggered fault-tolerant control for a quadrotor unmanned aerial vehicle with scheduled performance. Eng Appl Artif Intell 131:107832. https://doi.org/10.1016/j.engappai.2023.107832

4. Li Y, Shi J, Jiang W, Zhang W, Lyu Y (2021) Autonomous maneuver decision-making for a UCAV in short-range aerial combat based on an MS-DDQN algorithm. Defence Technol 18:1697–1714. https://doi.org/10.1016/j.dt.2021.09.014

5. Arar ÖF, Ayan K (2013) A flexible rule-based framework for pilot performance analysis in air combat simulation systems. Turk J Electr Eng Comput Sci 21:2397–2415. https://doi.org/10.3906/elk-1201-50