1. UPDeT: Universal multi-agent reinforcement learning via policy decoupling with transformers; Hu, 2021
2. Attention is all you need; Vaswani; Advances in Neural Information Processing Systems, 2017
3. Value-decomposition networks for cooperative multi-agent learning; Sunehag, 2017
4. Emergence of grounded compositional language in multi-agent populations
5. High-dimensional continuous control using generalized advantage estimation; Schulman, 2015