Agent behavior modeling method based on reinforcement learning and human in the loop

Author:

Huang Lin1ORCID,Gong Li1

Affiliation:

1. Navy University of Engineering , Wuhan 430033, China

Abstract

Computer generated force (CGF) is one of the increasingly important research topics in the field of simulation. However, low modeling efficiency and lack of adaptability are acute problems of traditional CGF modeling. In this study, a method for modeling the agent behavior based on reinforcement learning and human in the loop is proposed to improve the ability and efficiency of agent behavior modeling. First, an overall framework for modeling the behavior of intelligent agents is constructed based on the deep reinforcement learning algorithm Soft Actor Critic (SAC) framework. Second, in order to overcome the slow convergence speed of the SAC framework, a method for human interaction and value evaluation in the loop is introduced, and the specific algorithm flow is designed. Third, in order to verify the performance of the proposed method, experiments are conducted and compared with algorithms using a pure SAC framework based on an example of agent completing specific tasks. Result shows that after 100 episodes of training, the task completion rate of the agent can approach 100% while a pure SAC framework require at least 500 episodes of training to gradually improve the completion rate. Finally, the results demonstrate that the proposed method can significantly improve the efficiency of agent behavior modeling and the task completion rate increases with the number of human interventions in the loop.

Publisher

AIP Publishing

Subject

General Physics and Astronomy

Reference31 articles.

1. Effective behaviour modelling for computer generated forces,2019

2. Are we machine learning yet? Computer generated forces with learning capabilities in military simulation: Interservice/industry training, simulation, and education conference,2021

3. Data-driven behavioural modelling for military applications;J. Def. Secur. Technol.,2022

4. Behavior modeling for autonomous agents based on modified evolving behavior trees,2018

5. Dynamic scripting with team coordination in air combat simulation,2014

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3