Autonomous maneuver decision-making method based on reinforcement learning and Monte Carlo tree search

Published: 2022-10-25 Volume: 16
ISSN: 1662-5218
Container-title: Frontiers in Neurorobotics
Short-container-title: Front. Neurorobot.

Author:

Zhang Hongpeng, Zhou Huan, Wei Yujie, Huang Changqiang

Abstract

Autonomous maneuver decision-making methods for air combat often rely on human knowledge, such as advantage functions, objective functions, or dense rewards in reinforcement learning. This reliance limits the decision-making ability of unmanned combat aerial vehicles to the scope of human experience and results in slow progress in maneuver decision-making. Therefore, a maneuver decision-making method based on deep reinforcement learning and Monte Carlo tree search is proposed to investigate whether maneuver decision-making is feasible without human knowledge or advantage functions. To this end, Monte Carlo tree search in continuous action space is proposed, and neural network-guided Monte Carlo tree search with self-play is used to improve the ability of air combat agents. The method starts from random behaviors and, through self-play, generates samples consisting of states, actions, and results of air combat without using human knowledge. These samples are used to train the neural network, and the neural network with the greater winning rate is selected by simulations. This process is then repeated to gradually improve the maneuver decision-making ability. Simulations are conducted to verify the effectiveness of the proposed method; the kinematic model of the missile is used in the simulations instead of the missile engagement zone to test whether the maneuver decision-making method is effective. The simulation results for fixed and random initial states show that the proposed method is efficient and can meet the real-time requirement.
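
The loop described in the abstract (self-play from random behavior, training on the resulting samples, keeping whichever network wins more often, then repeating) can be illustrated with a minimal sketch. This is not the authors' implementation: the air combat simulation is replaced by a hypothetical one-step game, the neural network by a Gaussian policy over a single continuous action, and the Monte Carlo tree search is omitted entirely. All names (`Policy`, `play`, `self_play`, `train`, `win_rate`, `improve`, `TARGET`) are assumptions made for illustration.

```python
import random
from dataclasses import dataclass

TARGET = 0.5  # hypothetical best action, hidden from the agents


@dataclass
class Policy:
    """Stand-in for the neural network: a Gaussian over one continuous action."""
    mean: float
    std: float = 0.3

    def act(self, rng):
        # Sample an action and clip it to the action range [-1, 1].
        return max(-1.0, min(1.0, rng.gauss(self.mean, self.std)))


def play(p0, p1, rng):
    """Toy stand-in for one engagement: the action closer to TARGET wins."""
    actions = (p0.act(rng), p1.act(rng))
    errors = [abs(a - TARGET) for a in actions]
    winner = 0 if errors[0] <= errors[1] else 1  # ties go to player 0
    return actions, winner


def self_play(policy, n_games, rng):
    """Generate (action, result) samples; result is +1 for a win, -1 for a loss."""
    samples = []
    for _ in range(n_games):
        actions, winner = play(policy, policy, rng)
        for i, a in enumerate(actions):
            samples.append((a, 1.0 if i == winner else -1.0))
    return samples


def train(policy, samples, lr=0.5):
    """Move the policy mean toward the average winning action."""
    wins = [a for a, z in samples if z > 0]
    new_mean = policy.mean + lr * (sum(wins) / len(wins) - policy.mean)
    return Policy(new_mean, policy.std)


def win_rate(candidate, best, n_games, rng):
    """Fraction of games the candidate wins, alternating seats each game."""
    wins = 0
    for g in range(n_games):
        if g % 2 == 0:
            _, w = play(candidate, best, rng)
            wins += w == 0
        else:
            _, w = play(best, candidate, rng)
            wins += w == 1
    return wins / n_games


def improve(iterations=20, seed=0):
    rng = random.Random(seed)
    best = Policy(mean=rng.uniform(-1.0, 1.0))  # random initial behavior
    for _ in range(iterations):
        samples = self_play(best, 200, rng)
        candidate = train(best, samples)
        # Keep the candidate only if it wins more than half of the games.
        if win_rate(candidate, best, 200, rng) > 0.5:
            best = candidate
    return best


if __name__ == "__main__":
    print(improve())
```

The selection step is the part that mirrors the abstract most directly: a newly trained policy replaces the current one only when it achieves the greater winning rate in head-to-head simulations. The 0.5 threshold and the game counts are arbitrary choices for this sketch.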

Funder

National Natural Science Foundation of China

Natural Science Foundation of Shaanxi Province

Publisher

Frontiers Media SA

Subject

Artificial Intelligence, Biomedical Engineering

References: 40 articles.

1. Learning to play chess using temporal differences;Baxter;Mach. Learn,2000

2. Maneuvering decision in air combat based on multi-objective optimization and reinforcement learning;Du;J. Beij. Uni. Aero. Astronau,2018

3. A differential game approach for beyond visual range tactics;Eloy;2021 American Control Conference,2020

4. Background interpolation for on-line situation of capture zone of air-to-air missiles;Fang;J. Syst. Eng. Electron,2019

Cited by 6 articles.

1. Optimizing pursuit strategy for autonomous underwater vehicle considering payload-based capture condition;Ocean Engineering;2024-11

2. Autonomous Air Combat Maneuver Decision-Making Based on PPO-BWDA;IEEE Access;2024

3. Deep reinforcement learning-based air combat maneuver decision-making: literature review, implementation tutorial and future direction;Artificial Intelligence Review;2023-12-28

4. Multi-intent autonomous decision-making for air combat with deep reinforcement learning;Applied Intelligence;2023-10-21

5. Autonomous Agent for Beyond Visual Range Air Combat: A Deep Reinforcement Learning Approach;ACM SIGSIM Conference on Principles of Advanced Discrete Simulation;2023-06-21