Decoupled Monte Carlo Tree Search for Cooperative Multi-Agent Planning

Author:

Asik Okan (1), Aydemir Fatma Başak (1), Akın Hüseyin Levent (1)

Affiliation:

1. Department of Computer Engineering, Bogazici University, Istanbul 34342, Turkey

Abstract

The complexity of a cooperative multi-agent planning problem grows exponentially with the number of agents. Decoupled planning is one viable approach to reducing this complexity. By integrating decoupled planning with Monte Carlo Tree Search, we present a new scalable planning approach. The search tree maintains the updates of the individual actions of each agent separately. However, this separation introduces coordination and action synchronization problems. When an agent does not know the action of the other agent, it uses the returned reward to deduce the desirability of its own action. When a deterministic action selection policy is used in the Monte Carlo Tree Search algorithm, the actions of the agents become synchronized, so only some of the possible action combinations are evaluated. We show the effect of action synchronization on different problems and propose stochastic action selection policies. To address the coordination problem in decoupled planning, we also propose a combined method that uses decoupled planning as a pruning step for centralized planning: we create a centralized search tree with a subset of joint actions selected by the evaluation of decoupled planning. We empirically show that, when stochastic action selection is used, decoupled planning performs similarly to a centralized planning algorithm in repeated matrix games and multi-agent planning problems. We also show that the combined method improves the performance of the decoupled method in different problems. We compare the proposed method to a decoupled method on a warehouse commissioning problem, where our method achieves an improvement in performance of more than 10%.
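The action synchronization effect described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it reduces the search tree to a single node (a repeated matrix game), and the payoff matrix `PAYOFF`, the class `DecoupledAgent`, the function `run`, and the epsilon-greedy rule standing in for a stochastic action selection policy are all assumptions made for illustration. Each agent keeps statistics over its own actions only and sees just the shared reward.

```python
import math
import random

# Hypothetical 3x3 cooperative payoff matrix: both agents receive the same reward.
# The best joint action (0, 2) is deliberately placed off the diagonal.
PAYOFF = [
    [0.2, 0.0, 1.0],
    [0.0, 0.3, 0.0],
    [0.0, 0.0, 0.4],
]


class DecoupledAgent:
    """Keeps statistics over its own actions only, never over joint actions."""

    def __init__(self, n_actions, epsilon, rng):
        self.counts = [0] * n_actions
        self.means = [0.0] * n_actions
        self.epsilon = epsilon  # 0.0 gives purely deterministic UCB1 selection
        self.rng = rng

    def select(self, t):
        # Assumed stochastic policy: with probability epsilon pick uniformly.
        if self.epsilon > 0.0 and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Try every action once, in index order.
        for action, n in enumerate(self.counts):
            if n == 0:
                return action
        # Deterministic UCB1; ties are broken by the lowest action index.
        ucb = [m + math.sqrt(2.0 * math.log(t) / n)
               for m, n in zip(self.means, self.counts)]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, action, reward):
        # Incremental mean of the shared reward for this agent's own action.
        self.counts[action] += 1
        self.means[action] += (reward - self.means[action]) / self.counts[action]


def run(epsilon, rounds=5000, seed=0):
    """Play the repeated game and return the set of joint actions evaluated."""
    rng = random.Random(seed)
    agents = [DecoupledAgent(3, epsilon, rng) for _ in range(2)]
    visited = set()
    for t in range(1, rounds + 1):
        a0 = agents[0].select(t)
        a1 = agents[1].select(t)
        reward = PAYOFF[a0][a1]
        agents[0].update(a0, reward)
        agents[1].update(a1, reward)
        visited.add((a0, a1))
    return visited


if __name__ == "__main__":
    print("deterministic:", sorted(run(epsilon=0.0)))
    print("stochastic:   ", sorted(run(epsilon=0.2)))
```

With `epsilon=0.0`, both agents start from identical statistics, receive identical rewards, and break ties identically, so they always choose the same action index. Only the diagonal joint actions (0, 0), (1, 1), and (2, 2) are ever evaluated, and the best joint action (0, 2) is never tried. With a nonzero `epsilon`, the agents' choices can diverge, so off-diagonal joint actions are evaluated as well.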

Funder

The Turkish Directorate of Strategy and Budget under the TAM Project

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

