Military Decision Support with Actor and Critic Reinforcement Learning Agents

Published:2024-02-26 Issue:3 Volume:74 Page:389-398
ISSN:0976-464X
Container-title:Defence Science Journal
Short-container-title:Def. Sc. J.

Author:

Ma Jungmok

Abstract

Although recent advanced military operational concepts require intelligent support for command and control, Reinforcement Learning (RL) has not been actively studied in the military domain. This study identifies the limitations of RL for military applications through a literature review and aims to improve the understanding of RL for military decision support under those limitations. Above all, the black-box nature of Deep RL, combined with complex simulation tools, makes the internal process difficult to understand. A scalable weapon selection RL framework is built that can be solved in either tabular form or neural network form. Converting the Deep Q-Network (DQN) solution to tabular form makes it easier to compare with the Q-learning solution. Furthermore, rather than selectively using one or two RL models as in previous work, RL models are divided into actor and critic types and systematically compared. The following agents are designed, trained, and tested: a random agent, Q-learning and DQN agents as critics, a Policy Gradient (PG) agent as an actor, and Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) agents as actor-critic approaches. The performance results show that the trained DQN and PPO agents are the best decision-support candidates for the weapon selection RL framework.
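To make the "tabular form" of a critic concrete, the following is a minimal sketch of tabular Q-learning on a toy weapon selection problem. It is not the paper's framework: the reward table `REWARD`, the three-target sequence, the hyperparameters, and the function names `train_q_table` and `greedy_policy` are all assumptions introduced only for illustration.

```python
import random

# Hypothetical toy problem (NOT the paper's framework): three targets are
# engaged in sequence, and one of two weapons is chosen for each target.
# REWARD[s][a] is an assumed, deterministic payoff for weapon a on target s.
REWARD = [
    [1.0, 0.2],  # target 0: weapon 0 is more effective
    [0.1, 0.8],  # target 1: weapon 1 is more effective
    [0.5, 0.9],  # target 2: weapon 1 is more effective
]
N_STATES, N_ACTIONS = len(REWARD), len(REWARD[0])


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table (the critic in tabular form) with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        for s in range(N_STATES):
            # Epsilon-greedy exploration over the weapon choices.
            if rng.random() < epsilon:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda i: q[s][i])
            r = REWARD[s][a]
            # Bootstrapped value of the next target; zero after the last one.
            next_value = max(q[s + 1]) if s + 1 < N_STATES else 0.0
            # Q-learning temporal-difference update.
            q[s][a] += alpha * (r + gamma * next_value - q[s][a])
    return q


def greedy_policy(q):
    """Read the recommended weapon for each target off the Q-table."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(len(q))]


if __name__ == "__main__":
    q_table = train_q_table()
    for s, row in enumerate(q_table):
        print(f"target {s}: Q = {[round(v, 3) for v in row]}")
    print("greedy weapon per target:", greedy_policy(q_table))
```

Because every state-action value is stored explicitly, the learned table can be inspected row by row. This is the kind of transparency the abstract appeals to when it converts the DQN solution to tabular form for comparison with Q-learning.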

Publisher

Defence Scientific Information and Documentation Centre