Affiliation:
1. Department of CS, Hong Kong Baptist University, Hong Kong SAR, China
2. CMIC, Shanghai Jiao Tong University & Shanghai AI Laboratory, China
3. School of CSE, Nanjing University of Science and Technology, China
Abstract
Federated Learning (FL) learns a global model in a distributed manner without requiring local clients to share private data. This merit has drawn much attention to interactive scenarios, where Federated Reinforcement Learning (FRL) emerges as a cross-field research direction focusing on the robust training of agents. Unlike FL, the heterogeneity problem in FRL is more challenging because the data depend on the agents' policies and the environment dynamics: FRL learns to interact under non-stationary environment feedback, whereas typical FL methods handle a fixed data heterogeneity. In this article, we present one of the first analyses of the heterogeneity problem in FRL and propose an off-policy FRL framework. Specifically, a student–teacher–student model learning and fusion method, termed Server-Client Collaborative Distillation (SCCD), is introduced. Unlike traditional FL, we distill all local models on the server side for model fusion. To reduce training variance, a local distillation is also conducted each time an agent receives the global model. Experimentally, we compare SCCD with a range of straightforward combinations of FL methods and RL. The results demonstrate that SCCD achieves superior performance on four classical continuous control tasks with non-IID environments.
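The two-level distillation described above (server-side fusion of all local models, then a local distillation when the global model arrives) can be illustrated with a minimal numpy sketch. This is not the paper's actual method: it assumes toy linear policies, stands in a least-squares fit to the mean of client outputs for the server distillation loss, and a convex blend for the local distillation; `server_distill`, `local_distill`, the probe set, and `alpha` are all hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(W, states):
    # Toy linear "policy": action = states @ W
    return states @ W

def server_distill(client_weights, probe_states):
    # Server-side fusion: distill all client policies into one global
    # model by regressing onto the mean of their action outputs on a
    # shared probe set (least-squares stand-in for a distillation loss).
    targets = np.mean([predict(W, probe_states) for W in client_weights], axis=0)
    W_global, *_ = np.linalg.lstsq(probe_states, targets, rcond=None)
    return W_global

def local_distill(W_local, W_global, alpha=0.5):
    # Client-side step when the global model arrives: pull the local
    # policy toward the global one to reduce training variance
    # (a convex blend stands in for a local distillation loss).
    return (1 - alpha) * W_local + alpha * W_global

# Toy demo: three heterogeneous clients, one shared probe set.
clients0 = [rng.normal(size=(4, 2)) for _ in range(3)]
probe = rng.normal(size=(64, 4))
Wg = server_distill(clients0, probe)
blended = [local_distill(W, Wg) for W in clients0]
```

For linear models this server step exactly recovers the mean of the client weights; real SCCD operates on nonlinear policy networks, where distilling outputs differs from averaging parameters.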
Funder
NSFC Young Scientists Fund
Guangdong Basic and Applied Basic Research Foundation
RGC Early Career Scheme
CAAI-Huawei MindSpore Open Fund, and HKBU CSD Departmental Incentive Grant
NSFC/Research Grants Council (RGC) Joint Research Scheme
General Research Fund of RGC
Hong Kong Baptist University
National Key R&D Program of China
STCSM
111 plan
NSF of China
NSF of Jiangsu Province
Fundamental Research Funds for the Central Universities
Publisher
Association for Computing Machinery (ACM)
Cited by 2 articles.