Affiliation:
1. School of Mechanical Electronic & Information Engineering, China University of Mining and Technology-Beijing, Beijing 100083, China
Abstract
Ensuring satisfactory co-evolution efficiency for multiple agents in dynamic environments is challenging, since Actor-Critic training is prone to falling into local optima and fails to adapt quickly when the environment changes suddenly. To address this problem, this paper proposes a multi-agent adaptive co-evolution method for dynamic environments (ACE-D) built on the classical multi-agent reinforcement learning method MADDPG, which enables agents to adapt to new environments and co-evolve effectively. First, an experience screening policy is introduced into MADDPG to reduce the negative influence of experience collected in the original environment on the exploration of new environments. Then, an adaptive weighting policy is applied to the policy network: it generates benchmarks for the varying environments and assigns higher weights to the policies that are more beneficial for exploring the new environment, saving training time while improving the agents' adaptability. Finally, dynamic environments of different types and complexity levels are built to verify the co-evolutionary effects of the two policies separately and of the ACE-D method as a whole. The experimental results demonstrate that, compared with a range of other methods, ACE-D has clear advantages in helping multiple agents adapt to dynamic environments and avoid local optima, improving stable reward by more than 25% and training efficiency by more than 23%. The ACE-D method is therefore a valuable way to promote the co-evolution of multiple agents in dynamic environments.
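The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the paper's actual algorithm: the class and function names (`ScreenedReplayBuffer`, `adaptive_policy_weights`), the environment-version tagging, the `keep_ratio` parameter, and the softmax weighting are all assumptions made here for illustration; they only convey the general idea of screening stale experience after an environment change and weighting policies by how well they score against a benchmark in the new environment.

```python
import math
import random
from collections import deque


class ScreenedReplayBuffer:
    """Sketch of an experience screening policy (illustrative, not the
    paper's method): each transition is tagged with an environment
    version, and when the environment changes, most stale experience is
    discarded so old-environment data does not dominate exploration."""

    def __init__(self, capacity=10000, keep_ratio=0.2):
        self.buffer = deque(maxlen=capacity)
        self.env_version = 0
        self.keep_ratio = keep_ratio  # fraction of stale experience retained

    def add(self, transition):
        # Tag every transition with the environment it was collected in.
        self.buffer.append((self.env_version, transition))

    def on_environment_change(self):
        # Bump the version, then keep only a random subset of older experience.
        self.env_version += 1
        stale = [item for item in self.buffer if item[0] < self.env_version]
        kept = random.sample(stale, int(len(stale) * self.keep_ratio))
        self.buffer = deque(kept, maxlen=self.buffer.maxlen)

    def sample(self, batch_size):
        batch = random.sample(list(self.buffer), min(batch_size, len(self.buffer)))
        return [transition for _, transition in batch]


def adaptive_policy_weights(benchmark_scores, temperature=1.0):
    """Softmax-style weighting over per-policy benchmark scores: policies
    that score better in the new environment receive higher weight."""
    exps = [math.exp(s / temperature) for s in benchmark_scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a full MADDPG-style pipeline, such a buffer would feed the centralized critic's updates, and the weights would modulate how strongly each candidate policy contributes after an environment switch; the real ACE-D policies are defined in the paper itself.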
Funder
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities
National Training Program of Innovation and Entrepreneurship for Undergraduates
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)