Dual Policy Distillation

Published: 2020-07
Container-title: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence

Author:

Lai Kwei-Herng¹, Zha Daochen¹, Li Yuening¹, Hu Xia¹

Affiliation:

1. Texas A&M University

Abstract

Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning. This teacher-student framework requires a well-trained teacher model, which is computationally expensive. Moreover, the performance of the student model could be limited by the teacher model if the teacher model is not optimal. In light of collaborative learning, we study the feasibility of involving joint intellectual efforts from diverse perspectives of student models. In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of the environment and extract knowledge from each other to enhance their learning. The key challenge in developing this dual learning framework is to identify the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms, since it is unclear whether the knowledge distilled from an imperfect and noisy peer learner would be helpful. To address the challenge, we theoretically justify that distilling knowledge from a peer learner will lead to policy improvement and propose a disadvantageous distillation strategy based on the theoretical results. Experiments on several continuous control tasks show that the proposed framework achieves superior performance with a learning-based agent and function approximation without the use of expensive teacher models.
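The abstract does not spell out the distillation rule, so the following is only a minimal sketch of one plausible reading of "disadvantageous distillation": each learner imitates its peer only at states where the peer's value estimate is higher than its own. The function names, the hard 0/1 mask, and the squared-error action distance are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def disadvantage_mask(v_self: Sequence[float], v_peer: Sequence[float]) -> List[float]:
    """Return 1.0 for states where the peer's value estimate exceeds ours, else 0.0.

    Assumption: "disadvantageous" states are those where the learner
    is estimated to do worse than its peer.
    """
    return [1.0 if vp > vs else 0.0 for vs, vp in zip(v_self, v_peer)]


def distillation_loss(
    a_self: Sequence[Sequence[float]],
    a_peer: Sequence[Sequence[float]],
    v_self: Sequence[float],
    v_peer: Sequence[float],
) -> float:
    """Mean squared action distance to the peer over the masked states only.

    Assumption: continuous actions, so squared error stands in for a
    distance between the two policies' outputs.
    """
    mask = disadvantage_mask(v_self, v_peer)
    total = 0.0
    for own, peer, m in zip(a_self, a_peer, mask):
        total += m * sum((o - p) ** 2 for o, p in zip(own, peer))
    selected = sum(mask)
    return total / selected if selected > 0 else 0.0


# Hypothetical batch of three states with one-dimensional actions.
values_a = [1.0, 0.5, 2.0]
values_b = [0.8, 1.5, 2.5]
actions_a = [[0.0], [1.0], [0.0]]
actions_b = [[1.0], [0.0], [0.5]]

# Each learner distills from the other; the two losses differ because
# each one only looks at the states where it is at a disadvantage.
loss_a = distillation_loss(actions_a, actions_b, values_a, values_b)
loss_b = distillation_loss(actions_b, actions_a, values_b, values_a)
```

In a full training loop, each loss would presumably be added to that learner's usual reinforcement learning objective, so that the two students keep exploring independently while borrowing from each other where it appears to help.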

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 14 articles.

1. PDD: Pruning Neural Networks During Knowledge Distillation;Cognitive Computation;2024-08-31

2. Skill enhancement learning with knowledge distillation;Science China Information Sciences;2024-07-22

3. Online Policy Distillation with Decision-Attention;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

4. Taking complementary advantages: Improving exploration via double self-imitation learning in procedurally-generated environments;Expert Systems with Applications;2024-03

5. Leveraging Knowledge Distillation for Efficient Deep Reinforcement Learning in Resource-Constrained Environments;2023 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML);2023-11-03