Discovering Expert-Level Air Combat Knowledge via Deep Excitatory-Inhibitory Factorized Reinforcement Learning

Authors:

Piao Hai Yin¹, Yang Shengqi², Chen Hechang³, Li Junnan², Yu Jin², Peng Xuanqi², Yang Xin⁴, Yang Zhen⁵, Sun Zhixiao⁵, Chang Yi⁶

Affiliations:

1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China

2. SADRI Institute, Shenyang, China

3. Jilin University, Changchun, China

4. Dalian University of Technology, Dalian, China

5. Northwestern Polytechnical University, Xi'an, China

6. School of Artificial Intelligence, Jilin University, Changchun, China

Abstract

Artificial Intelligence (AI) has recently achieved a wide range of successes in autonomous air combat decision-making. Previous research has demonstrated that AI-enabled air combat approaches can even surpass human-level capabilities. However, two major difficulties remain largely unaddressed. First, existing methods with fixed decision intervals focus mostly on what to act and rarely on when to act, so they occasionally miss optimal decision opportunities. Second, relying on an expert-crafted finite maneuver library limits tactical diversity, which leaves the agent vulnerable to an opponent equipped with new tactics. To address these issues, we propose a novel autonomous air combat tactics discovery algorithm that combines Deep Reinforcement Learning (DRL) with prior knowledge, namely deep Excitatory-iNhibitory fACTorIzed maneuVEr (ENACTIVE) learning. The algorithm consists of two key modules, ENHANCE and FACTIVE. Specifically, ENHANCE learns to adjust the air combat decision-making intervals and to seize key opportunities appropriately. FACTIVE factorizes maneuvers and then jointly optimizes them, significantly increasing tactical diversity. Extensive experimental results show that the proposed method outperforms state-of-the-art algorithms with a 62% winning rate and achieves a 2.85-fold increase in global tactic space coverage. They also demonstrate that a variety of the discovered air combat tactics are comparable to human experts' knowledge.

Publisher

Association for Computing Machinery (ACM)

