Abstract
Artificial intelligence for aircraft guidance is an active research topic, and deep reinforcement learning is one of the most promising approaches. However, because destinations move differently across guidance tasks, training an agent from scratch for each task is inefficient. This article proposes a policy-reuse algorithm based on destination position prediction to address this problem. First, the reward function is optimized to improve flight-trajectory quality and training efficiency. Then, by predicting the likely termination position of the destination under each movement pattern, the problem is transformed into a guidance problem with a fixed-position destination. Finally, taking the agent trained in the fixed-position-destination scenario as the baseline agent, a new guidance agent can be trained efficiently. Simulation results show that this method significantly improves training efficiency on new tasks and performs stably across tasks of varying similarity. This work broadens the applicability of the policy-reuse approach and may inform research in other fields.
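The core transformation described above, reducing a moving-destination task to a fixed-destination one, can be illustrated with a minimal sketch. The linear-extrapolation predictor below is an assumption for illustration only; the article does not specify the prediction model, and the function and variable names are hypothetical.

```python
import numpy as np

def predict_termination_position(positions, dt, horizon_steps):
    """Estimate where a moving destination will be after `horizon_steps`
    time steps by linear extrapolation of its observed average velocity.
    (Illustrative stand-in for the paper's destination position predictor.)"""
    velocity = (positions[-1] - positions[0]) / (dt * (len(positions) - 1))
    return positions[-1] + velocity * dt * horizon_steps

# Hypothetical observed destination track: 2-D positions sampled at dt = 1 s.
track = np.array([[0.0, 0.0], [10.0, 5.0], [20.0, 10.0]])

# Predicted terminal position, treated as a fixed goal from here on.
fixed_goal = predict_termination_position(track, dt=1.0, horizon_steps=4)

# The fixed goal is then fed to an agent initialized from the baseline
# (fixed-position-destination) policy, which is fine-tuned rather than
# trained from scratch -- the policy-reuse step.
```

Once the moving destination is replaced by `fixed_goal`, the pre-trained baseline policy applies directly, which is what makes reuse (rather than retraining) possible.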
Funder
Guangxi Key Laboratory of International Join for China-ASEAN Comprehensive Transportation
Fundamental Research Funds for the Central Universities
Cited by
1 article.