Improving Agent Decision Payoffs via a New Framework of Opponent Modeling

Author:

Liu Chanjuan 1, Cong Jinmiao 1, Zhao Tianhao 1, Zhu Enqiang 2 (ORCID)

Affiliation:

1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China

2. Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China

Abstract

The payoff of an agent depends on both the environment and the actions of other agents. Thus, the ability to model and predict the strategies and behaviors of other agents in an interactive decision-making scenario is one of the core functionalities of intelligent systems. State-of-the-art methods for opponent modeling mainly build an explicit model of opponents' actions, preferences, targets, etc., which the primary agent then uses to make decisions. However, it is more important for an agent to increase its payoff than to accurately predict opponents' behavior. Therefore, we propose a framework that synchronizes the opponent modeling and decision making of the primary agent by incorporating opponent modeling into reinforcement learning. In interactive decisions, the payoff depends not only on the behavioral characteristics of the opponent but also on the current state. Confounding the two obscures the effects of state and action, which then cannot be accurately encoded. To this end, state evaluation is separated from action evaluation in our model. Experimental results from two game environments, a simulated soccer game and a real game called quiz bowl, show that introducing opponent modeling effectively improves decision payoffs. In addition, the proposed opponent-modeling framework outperforms benchmark models.
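The separation of state evaluation from action evaluation described in the abstract can be illustrated with a minimal sketch. This is a hypothetical example, not the authors' implementation: it assumes a dueling-style value decomposition in which a scalar state value V and per-action advantages A are computed by separate heads, with an opponent embedding concatenated to the state features. All names, dimensions, and the linear "heads" are illustrative stand-ins for learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, OPP_DIM, N_ACTIONS = 8, 4, 3

# Random linear heads stand in for trained value/advantage networks.
W_v = rng.normal(size=(STATE_DIM + OPP_DIM, 1))          # state-value head
W_a = rng.normal(size=(STATE_DIM + OPP_DIM, N_ACTIONS))  # advantage head

def q_values(state, opp_embedding):
    """Combine separately evaluated state value and action advantages.

    The opponent embedding is appended to the state features, so both
    evaluations are conditioned on the modeled opponent.
    """
    x = np.concatenate([state, opp_embedding])
    v = x @ W_v            # scalar state value V(s, opp), shape (1,)
    a = x @ W_a            # action advantages A(s, a, opp), shape (N_ACTIONS,)
    # Subtracting the mean advantage keeps V and A identifiable.
    return (v + a - a.mean()).ravel()

state = rng.normal(size=STATE_DIM)
opp = rng.normal(size=OPP_DIM)
q = q_values(state, opp)
best_action = int(np.argmax(q))
```

Because the mean advantage is subtracted, the mean of the resulting Q-values equals the state value, so the two evaluations remain disentangled rather than confounded in a single Q-head.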

Funder

Natural Science Foundation of Guangdong Province of China

Natural Science Foundation of Liaoning Province of China

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

