Joint Optimization of Bandwidth and Power Allocation in Uplink Systems with Deep Reinforcement Learning

Published: 2023-07-31  Issue: 15  Volume: 23  Page: 6822
ISSN: 1424-8220
Container-title: Sensors
Language: en
Short-container-title: Sensors

Author:

Zhang Chongli¹, Lv Tiejun¹, Huang Pingmu², Lin Zhipeng³, Zeng Jie⁴, Ren Yuan⁵

Affiliation:

1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China

2. School of Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China

3. Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing 211106, China

4. School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing 100081, China

5. Shaanxi Key Laboratory of Information Communication Network and Security, School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China

Abstract

Efficient use of wireless resources is a focus of future communication systems, since it alleviates the degradation in communication quality caused by the sharp growth in interference as the number of users increases, especially inter-cell interference in multi-cell multi-user systems. To tackle this interference and improve resource utilization, we propose a joint-priority-based reinforcement learning (JPRL) approach that jointly optimizes bandwidth and transmit power allocation. The method aims to maximize the average system throughput while suppressing co-channel interference and satisfying the quality of service (QoS) constraint. Specifically, we decouple the joint problem into two sub-problems: bandwidth assignment and power allocation. A multi-agent double deep Q network (MADDQN) is developed to solve the bandwidth allocation sub-problem for each user, and a prioritized multi-agent deep deterministic policy gradient (P-MADDPG) algorithm, which deploys a prioritized replay buffer, is designed to handle the transmit power allocation sub-problem. Numerical results show that the proposed JPRL method accelerates model training and outperforms the alternative methods in terms of throughput. For example, its average throughput is approximately 10.4–15.5% higher than that of the homogeneous-learning-based benchmarks and about 17.3% higher than that of the genetic algorithm.
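
The abstract names a prioritized replay buffer but does not say which prioritization scheme is used. As an illustration only, the sketch below assumes the common proportional variant, in which a transition's sampling probability grows with its absolute temporal-difference (TD) error and importance-sampling weights correct the resulting bias. The class name, the parameters `alpha`, `beta` and `eps`, and the transition format are all hypothetical and are not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha    # assumed exponent: 0 gives uniform sampling
        self.eps = eps        # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0          # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # they are likely to be replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to the stored priority.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the batch maximum.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Called after a learning step with the freshly computed TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha
```

In a typical training loop, an agent would call `add` after each environment step, draw a batch with `sample`, scale each sample's loss by its weight, and then pass the new TD errors to `update_priorities`.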

Funder

National Natural Science Foundation of China

Beijing Natural Science Foundation

Basic Scientific Research Project

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/23/15/6822/pdf

Reference: 45 articles.

1. Liu, G., Cai, B., and Xie, W. (2021, January 4–6). Research on 5G Wireless Networks and Evolution. Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Chengdu, China.

2. Shah (2021). Survey and Performance Evaluation of Multiple Access Schemes for Next-Generation Wireless Communication Systems. IEEE Access.

3. Song (2022). Terahertz Communications: Challenges in the Next Decade. IEEE Trans. Terahertz. Sci. Technol.

4. Vaezi (2022). Cellular, Wide-Area, and Non-Terrestrial IoT: A Survey on 5G Advances and the Road Toward 6G. IEEE Commun. Surv. Tutor.

5. Sharma (2020). Toward Massive Machine Type Communications in Ultra-Dense Cellular IoT Networks: Current Issues and Machine Learning-Assisted Solutions. IEEE Commun. Surv. Tutor.