Author:
Yan Mengda, Yang Rennong, Zhang Ying, Yue Longfei, Hu Dongyuan
Abstract
This paper proposes a missile manoeuvring algorithm based on hierarchical proximal policy optimization (PPO) reinforcement learning, which enables a missile to home in on a target and evade an interceptor at the same time. Following the idea of task hierarchy, the agent has a two-layer structure in which low-level agents control basic actions and are in turn controlled by a high-level agent. The low level consists of two agents, a guidance agent and an evasion agent, which are trained in simple scenarios and then embedded in the high-level agent. The high level consists of a policy selector agent, which chooses one of the low-level agents to activate at each decision moment. Each agent has its own reward function, which accounts for guidance accuracy, flight time, and energy consumption, as well as a field-of-view constraint. Simulations show that the PPO algorithm without a hierarchical structure cannot complete the task, while the hierarchical PPO algorithm achieves a 100% success rate on a test dataset. The agent shows good adaptability and strong robustness to the second-order lag of the autopilot and to measurement noise. Compared with a traditional guidance law, the reinforcement learning guidance law has satisfactory guidance accuracy and significant advantages in average flight time and average energy consumption.
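The two-layer structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three policies below are hand-written stand-ins for the trained PPO networks, and the observation layout, gains, and the 2000 m switching range are all assumptions made for illustration. Only the control flow follows the abstract: at each decision moment the high-level selector picks one low-level agent, and that agent produces the basic action.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed observation layout (hypothetical):
# [LOS rate to target, LOS rate to interceptor, range to interceptor in metres]
Observation = Sequence[float]
LowLevelPolicy = Callable[[Observation], float]  # observation -> commanded acceleration
Selector = Callable[[Observation], int]          # observation -> index of low-level agent


def guidance_policy(obs: Observation) -> float:
    """Stand-in for the trained guidance agent (hypothetical gain of 3.0)."""
    return 3.0 * obs[0]


def evasion_policy(obs: Observation) -> float:
    """Stand-in for the trained evasion agent (hypothetical gain of -5.0)."""
    return -5.0 * obs[1]


def threshold_selector(obs: Observation) -> int:
    """Stand-in for the trained policy selector.

    Returns 1 (evasion) when the interceptor is closer than an assumed
    2000 m, otherwise 0 (guidance). The paper learns this choice with PPO.
    """
    return 1 if obs[2] < 2000.0 else 0


@dataclass
class HierarchicalAgent:
    selector: Selector
    low_level: List[LowLevelPolicy]

    def act(self, obs: Observation) -> Tuple[int, float]:
        """Pick one low-level agent, then let it compute the action."""
        index = self.selector(obs)
        return index, self.low_level[index](obs)


agent = HierarchicalAgent(threshold_selector, [guidance_policy, evasion_policy])
far = agent.act([0.5, 0.25, 5000.0])   # interceptor far away: guidance is selected
near = agent.act([0.5, 0.25, 1000.0])  # interceptor close: evasion is selected
```

In this sketch the selector is the only component that sees the whole engagement, and each low-level policy stays simple. That mirrors the abstract's point that the low-level agents can be trained separately in simple scenarios before being embedded in the high-level agent.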
Publisher
Springer Science and Business Media LLC
Cited by
5 articles.