Authors:
Liu Jia-yi, Wang Gang, Guo Xiang-ke, Wang Si-yuan, Fu Qiang
Abstract
Modern air defense battlefield situations are complex and varied, demanding high-speed computation and real-time situational processing for task assignment. Current methods struggle to balance the quality and speed of assignment strategies. This paper proposes a hierarchical reinforcement learning architecture for ground-to-air confrontation (HRL-GC) and an algorithm combining model predictive control with proximal policy optimization (MPC-PPO), which combines the advantages of centralized and distributed approaches to improve training efficiency while ensuring the quality of the final decision. In a large-scale area air defense scenario, this paper validates the effectiveness and superiority of the HRL-GC architecture and the MPC-PPO algorithm, showing that the method meets the quality and speed requirements of large-scale air defense task assignment.
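The abstract names MPC-PPO but does not spell out its update rule. As context, a minimal sketch of the clipped surrogate objective that standard PPO optimizes is shown below; the function name and the NumPy formulation are illustrative assumptions, not the paper's implementation, and the MPC component is omitted since the abstract gives no detail on it.

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate loss from standard PPO (illustrative sketch).

    ratio:     pi_new(a|s) / pi_old(a|s) probability ratios per sample
    advantage: estimated advantages A(s, a) per sample
    eps:       clipping range; 0.2 is a common default
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # PPO maximizes the elementwise minimum of the two terms,
    # so the loss to minimize is its negated mean.
    return -np.mean(np.minimum(unclipped, clipped))

# Toy check: with a positive advantage, a ratio outside [1-eps, 1+eps]
# is clipped, so larger policy steps earn no extra objective.
adv = np.array([1.0])
loss_inside = ppo_clip_loss(np.array([1.1]), adv)   # ratio within clip range
loss_outside = ppo_clip_loss(np.array([2.0]), adv)  # ratio clipped to 1.2
```

The clipping is what keeps each policy update small, which is the usual motivation for combining PPO with a model-based component such as MPC for sample efficiency.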
Subject
Artificial Intelligence, Biomedical Engineering
References: 32 articles.
Cited by: 1 article.