5G Multi-Slices Bi-Level Resource Allocation by Reinforcement Learning

Published: 2023-02-02 Issue: 3 Volume: 11 Page: 760
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Short-container-title: Mathematics

Author:

Yu Zhipeng¹, Gu Fangqing¹, Liu Hailin¹, Lai Yutao¹

Affiliation:

1. School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou 510520, China

Abstract

With the separation of the centralized unit (CU) and the distributed unit (DU) in fifth-generation (5G) mobile networks, multi-slice and multi-scenario operation can be better applied in wireless communication. The expansion of 5G networks into vertical industries also gives resource allocation an obvious hierarchical structure. In this paper, we propose a bi-level resource allocation model. The upper-level objective is the profit the 5G operator earns when the base station allocates resources to slices. The lower-level objective is for each slice to allocate its resources fairly among its users. The resource allocation problem is a complex optimization problem with mixed-discrete variables, so the key to practical application is whether an algorithm can produce a resource allocation scheme quickly and accurately. Given these characteristics, we select the multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm to solve the upper-level slice resource allocation and the discrete and continuous twin delayed deep deterministic policy gradient (DCTD3) algorithm to solve the lower-level user resource allocation. Accurately characterizing the state, environment, and reward is crucial when applying reinforcement learning to practical problems. We therefore provide an effective definition of the environment, state, action, and reward of MATD3 and DCTD3 for solving the bi-level resource allocation problem. We conduct simulation experiments and compare the proposed approach with the multi-agent deep deterministic policy gradient (MADDPG) algorithm and the nested bi-level evolutionary algorithm (NBLEA). The experimental results show that the proposed algorithm can quickly provide a better resource allocation scheme.
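
The bi-level structure described in the abstract can be illustrated with a small nested evaluation. The sketch below is not the paper's MATD3/DCTD3 method: it swaps reinforcement learning for a plain grid search at the upper level and an equal-rate split at the lower level, and it ignores the mixed-discrete variables. The two-slice setup, the `price`, `min_rate`, and `snrs` fields, and the Shannon-style rate `bandwidth * log2(1 + SNR)` are all assumptions made for illustration only.

```python
import math


def lower_level(bandwidth, snrs):
    """Hypothetical lower level: split one slice's bandwidth so every user gets the same rate."""
    eff = [math.log2(1.0 + s) for s in snrs]          # assumed spectral efficiency per user
    inv = [1.0 / e for e in eff]                       # weaker users need more bandwidth
    shares = [bandwidth * w / sum(inv) for w in inv]   # shares sum to the slice bandwidth
    rates = [b * e for b, e in zip(shares, eff)]       # equal by construction
    return shares, rates


def jain_index(values):
    """Jain's fairness index: 1.0 when all values are equal, 1/n in the worst case."""
    squares = sum(v * v for v in values)
    return (sum(values) ** 2) / (len(values) * squares) if squares > 0 else 0.0


def upper_level(total_bw, slices, step=1.0):
    """Hypothetical upper level: grid-search a two-slice split that maximizes operator profit.

    Each candidate split is scored by first solving the lower level for every slice,
    which is the nesting that makes the problem bi-level.
    """
    best_split, best_profit = None, float("-inf")
    for k in range(int(round(total_bw / step)) + 1):
        split = (k * step, total_bw - k * step)
        profit, feasible = 0.0, True
        for bw, sl in zip(split, slices):
            _, rates = lower_level(bw, sl["snrs"])
            if min(rates) < sl["min_rate"]:            # assumed per-user rate guarantee
                feasible = False
                break
            profit += sl["price"] * sum(rates)         # assumed revenue per unit of rate
        if feasible and profit > best_profit:
            best_split, best_profit = split, profit
    return best_split, best_profit


# Illustrative numbers only; none of these values come from the paper.
SLICES = [
    {"price": 2.0, "min_rate": 1.0, "snrs": [3.0, 15.0]},
    {"price": 1.0, "min_rate": 2.0, "snrs": [7.0, 7.0]},
]

if __name__ == "__main__":
    split, profit = upper_level(10.0, SLICES)
    print("bandwidth per slice:", split, "profit:", round(profit, 3))
```

In this toy setting the upper level gives the lower-priced slice just enough bandwidth to meet its rate guarantee and assigns the rest to the higher-priced slice. Exhaustive nesting of this kind grows expensive as the number of slices and users increases, which is one motivation for learned policies such as those named in the abstract.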

Funder

National Natural Science Foundation of China

Natural Science Foundation of Guangdong Province

Programme of Science and Technology of Guangdong Province

Publisher

MDPI AG

Subject

General Medicine

Link

https://www.mdpi.com/2227-7390/11/3/760/pdf
