Server-Client Collaborative Distillation for Federated Reinforcement Learning

Author:

Mai Weiming (1), Yao Jiangchao (2), Chen Gong (3), Zhang Ya (2), Cheung Yiu-Ming (1), Han Bo (1)

Affiliation:

1. Department of CS, Hong Kong Baptist University, Hong Kong SAR, China

2. CMIC, Shanghai Jiao Tong University & Shanghai AI Laboratory, China

3. School of CSE, Nanjing University of Science and Technology, China

Abstract

Federated Learning (FL) learns a global model in a distributed manner without requiring local clients to share private data. This merit has drawn considerable attention in interactive scenarios, where Federated Reinforcement Learning (FRL) has emerged as a cross-field research direction focusing on the robust training of agents. Compared with FL, the heterogeneity problem in FRL is more challenging because the data depend on both the agents' policies and the environment dynamics. FRL must learn to interact under non-stationary environment feedback, whereas typical FL methods are designed to handle static data heterogeneity. In this article, we make one of the first attempts to analyze the heterogeneity problem in FRL and propose an off-policy FRL framework. Specifically, we introduce a student–teacher–student model learning and fusion method, termed Server-Client Collaborative Distillation (SCCD). Unlike traditional FL, we distill all local models on the server side for model fusion. To reduce the training variance, a local distillation step is also conducted every time an agent receives the global model. Experimentally, we compare SCCD with a range of straightforward combinations of FL methods and RL. The results demonstrate that SCCD achieves superior performance on four classical continuous control tasks with non-IID environments.
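The two distillation steps named in the abstract can be illustrated with a toy sketch. This is not the authors' SCCD implementation: the abstract gives no losses, architectures, or training details, so everything below is an assumption made for illustration. Policies are hypothetical linear maps from states to actions, the teacher signal is the mean action of the teacher policies, and distillation is plain gradient descent on a mean-squared error. The names `policy`, `distill`, `client_weights`, and `global_weights` are invented here.

```python
import numpy as np

def policy(weights, states):
    """Hypothetical linear deterministic policy: action = states @ weights."""
    return states @ weights

def distill(student, teachers, states, lr=0.1, steps=500):
    """Fit a student to the mean teacher action on `states` (MSE, gradient descent)."""
    # Assumption: teachers are combined by averaging their actions.
    target = np.mean([policy(w, states) for w in teachers], axis=0)
    student = student.copy()
    for _ in range(steps):
        error = policy(student, states) - target
        grad = 2.0 * states.T @ error / len(states)
        student -= lr * grad
    return student

rng = np.random.default_rng(0)
state_dim, action_dim, n_clients = 4, 2, 3

# Stand-ins for locally trained client policies (random here, not trained by RL).
client_weights = [rng.normal(size=(state_dim, action_dim)) for _ in range(n_clients)]

# Server side: all client policies act as teachers for one global student.
server_states = rng.normal(size=(256, state_dim))
global_weights = distill(
    np.zeros((state_dim, action_dim)), client_weights, server_states
)

# Client side: on receiving the global model, each client distills it into
# its own policy for a few steps instead of overwriting the local weights.
local_states = rng.normal(size=(64, state_dim))
updated_clients = [
    distill(w, [global_weights], local_states, steps=50) for w in client_weights
]
```

With linear policies and an averaged teacher, the server-side student simply recovers the mean of the client weights, so this toy case coincides with plain weight averaging. Distillation only differs from averaging when the policies are nonlinear networks, which is presumably the setting the article addresses.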

Funder

NSFC Young Scientists Fund

Guangdong Basic and Applied Basic Research Foundation

RGC Early Career Scheme

CAAI-Huawei MindSpore Open Fund, and HKBU CSD Departmental Incentive Grant

NSFC/Research Grants Council (RGC) Joint Research Scheme

General Research Fund of RGC

Hong Kong Baptist University

National Key R&D Program of China

STCSM

111 plan

NSF of China

NSF of Jiangsu Province

Fundamental Research Funds for the Central Universities

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science


Cited by 2 articles.