Policy Compression for Intelligent Continuous Control on Low-Power Edge Devices

Published: 2024-07-27
Volume: 24
Issue: 15
Page: 4876
ISSN: 1424-8220
Container-title: Sensors
Language: en

Author:

Thomas Avé¹, Tom De Schepper², Kevin Mets³

Affiliation:

1. IDLab—Department of Computer Science, University of Antwerp—IMEC, Sint-Pietersvliet 7, 2000 Antwerp, Belgium

2. AI & Data Department, IMEC, 3001 Leuven, Belgium

3. IDLab—Faculty of Applied Engineering, University of Antwerp—IMEC, Sint-Pietersvliet 7, 2000 Antwerp, Belgium

Abstract

Interest in deploying deep reinforcement learning (DRL) models on low-power edge devices, such as Autonomous Mobile Robots (AMRs) and Internet of Things (IoT) devices, has risen significantly. Running inference on the device enables real-time operation by eliminating the latency and reliability issues of wireless communication, and processing data locally offers privacy benefits. Deploying such energy-intensive models on power-constrained devices is not always feasible, however, which has led to the development of model compression techniques that reduce the size and computational complexity of DRL policies. Policy distillation, the most popular of these methods, lowers the number of network parameters by transferring the behavior of a large teacher network to a smaller student model, which is then deployed at the edge. This works well with deterministic policies that operate using discrete actions. However, many power-constrained real-world tasks, such as those in robotics, are formulated using continuous action spaces, which existing policy distillation does not support. In this work, we improve the policy distillation method to support the compression of DRL models designed to solve these continuous control tasks, with an emphasis on maintaining the stochastic nature of continuous DRL algorithms. Experiments show that our methods can be used effectively to compress such policies up to 750% while maintaining or even exceeding their teacher’s performance by up to 41% in solving two popular continuous control tasks.
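
To make the idea of distilling a stochastic continuous policy concrete, the sketch below computes one plausible distillation loss: the KL divergence between a teacher's and a student's diagonal Gaussian action distributions, averaged over a batch of states. This is an illustrative assumption, not necessarily the loss used in the paper; the function names `gaussian_kl` and `distillation_loss` and the (means, standard deviations) output format are hypothetical.

```python
import math


def gaussian_kl(mu_t, sigma_t, mu_s, sigma_s):
    """KL(teacher || student) for diagonal Gaussian action distributions.

    Each argument is a list with one entry per action dimension.
    Uses the closed form for univariate Gaussians, summed over dimensions.
    """
    total = 0.0
    for mt, st, ms, ss in zip(mu_t, sigma_t, mu_s, sigma_s):
        total += math.log(ss / st) + (st ** 2 + (mt - ms) ** 2) / (2 * ss ** 2) - 0.5
    return total


def distillation_loss(teacher_outputs, student_outputs):
    """Mean KL over a batch of states.

    Each element of the inputs is a (means, stds) pair that the
    respective policy network (hypothetical here) produced for one state.
    """
    kls = [
        gaussian_kl(mu_t, sigma_t, mu_s, sigma_s)
        for (mu_t, sigma_t), (mu_s, sigma_s) in zip(teacher_outputs, student_outputs)
    ]
    return sum(kls) / len(kls)


# Toy batch of two states with a one-dimensional action.
teacher = [([0.0], [1.0]), ([0.0], [1.0])]
student = [([0.0], [1.0]), ([1.0], [1.0])]
loss = distillation_loss(teacher, student)
```

Matching the full distribution rather than only the mean action is one way to keep the student stochastic; in practice the loss would be minimized with gradient descent over the student's parameters.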

Funder

Research Foundation Flanders

euROBIN

OpenSwarm

Publisher

MDPI AG

Link

https://www.mdpi.com/1424-8220/24/15/4876/pdf

References (34 articles)

1. Sun, Y., Lu, T., Wang, T., Fan, H., Liu, D., and Du, B. (2023, January 18–20). Deep Reinforcement Learning for Delay and Energy-Aware Task Scheduling in Edge Clouds. Proceedings of the Computer Supported Cooperative Work and Social Computing, Harbin, China.

2. Alhartomi (2024). Enhancing Sustainable Edge Computing Offloading via Renewable Prediction for Energy Harvesting. IEEE Access.

3. Tang (2022). Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing Systems. IEEE Trans. Mob. Comput.

4. Mismar (2020). Deep Reinforcement Learning for 5G Networks: Joint Beamforming, Power Control, and Interference Coordination. IEEE Trans. Commun.

5. Avé, T., Soto, P., Camelo, M., De Schepper, T., and Mets, K. (2024, January 6–10). Policy Compression for Low-Power Intelligent Scaling in Software-Based Network Architectures. Proceedings of the NOMS 2024 IEEE Network Operations and Management Symposium, Seoul, Republic of Korea.