Three-Dimensional Trajectory and Resource Allocation Optimization in Multi-Unmanned Aerial Vehicle Multicast System: A Multi-Agent Reinforcement Learning Method

Published: 2023-10-19 Issue: 10 Volume: 7 Page: 641
ISSN: 2504-446X
Container-title: Drones
Language: en
Short-container-title: Drones

Author:

Wang Dongyu¹, Liu Yue¹, Yu Hongda¹, Hou Yanzhao²

Affiliation:

1. The Key Laboratory of Universal Wireless Communication, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China

2. Shenzhen Institute, Beijing University of Posts and Telecommunications, Shenzhen 518055, China

Abstract

Unmanned aerial vehicles (UAVs) can act as movable aerial base stations to enhance wireless coverage for edge users with poor ground communication quality. In urban environments, however, the link between UAVs and ground users can be blocked by obstacles, and complicated terrestrial infrastructure increases the probability of non-line-of-sight (NLoS) links. To improve the average throughput, we propose a multi-UAV multicast system in which a multi-agent reinforcement learning method helps the UAVs determine their optimal altitude and trajectory. Intelligent reflective surfaces (IRSs) are also employed to reflect signals and solve the blocking problem. Furthermore, since a UAV’s onboard power is limited, we aim to minimize the UAVs’ energy consumption and maximize the transmission rate for edge users by jointly optimizing the UAVs’ 3D trajectory and transmit power. First, we derive the channel capacity of ground users in different multicast groups. Next, the K-medoids algorithm is used to partition edge users into multicast groups based on their transmission rate requirements. Then, we employ the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm to learn an optimal solution and eliminate the non-stationarity of multi-agent training. Finally, simulation results show that the proposed system increases the average throughput by approximately 14% compared with the non-grouping system, and that the MADDPG algorithm achieves a 20% improvement in reducing the energy consumption of UAVs compared with traditional deep reinforcement learning (DRL) methods.
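
The grouping step mentioned in the abstract can be illustrated with a minimal sketch of alternating K-medoids clustering. This is not the authors' implementation: the function names (`assign`, `k_medoids`), the one-dimensional rate-requirement values, and the absolute-difference distance are all assumptions chosen for illustration, and the paper may use different features or a different K-medoids variant.

```python
import random


def assign(points, medoids, distance):
    """Label each point with the index of its nearest medoid.

    A medoid is always labelled with itself, so no group is ever empty.
    """
    labels = []
    for i, p in enumerate(points):
        if i in medoids:
            labels.append(i)
        else:
            labels.append(min(medoids, key=lambda m: distance(p, points[m])))
    return labels


def k_medoids(points, k, distance, max_iter=100, seed=0):
    """Alternating K-medoids: assign points, then re-pick each group's medoid."""
    rng = random.Random(seed)
    medoids = sorted(rng.sample(range(len(points)), k))
    for _ in range(max_iter):
        labels = assign(points, medoids, distance)
        new_medoids = []
        for m in medoids:
            members = [i for i, lab in enumerate(labels) if lab == m]
            # The new medoid is the member with the smallest total distance
            # to all other members of its group.
            best = min(
                members,
                key=lambda c: sum(distance(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)
        new_medoids.sort()
        if new_medoids == medoids:  # converged: medoids no longer change
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, distance)


# Hypothetical per-user rate requirements (arbitrary units, illustration only).
rate_requirements = [1.0, 1.2, 0.9, 10.0, 10.4, 9.8]
medoids, labels = k_medoids(rate_requirements, 2, lambda a, b: abs(a - b))
groups = {
    m: [rate_requirements[i] for i, lab in enumerate(labels) if lab == m]
    for m in medoids
}
print(groups)
```

Unlike k-means, each group's representative here is an actual user rather than an averaged point, which is the defining property of K-medoids. In this toy input the low-rate and high-rate users end up in separate groups.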

Funder

National Key R&D Program of China

Shenzhen Science and Technology Innovation Commission Free Exploring Basic Research Project

Publisher

MDPI AG

Subject

Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering

Link

https://www.mdpi.com/2504-446X/7/10/641/pdf

Reference: 26 articles.

1. Abubakar, A.I., Ahmad, I., Omeke, K.G., Ozturk, M., Ozturk, C., Abdel-Salam, A.M., Mollel, M.S., Abbasi, Q.H., Hussain, S., and Imran, M.A. (2023). A Survey on Energy Optimization Techniques in UAV-Based Cellular Networks: From Conventional to Machine Learning Approaches. Drones, 7.

2. Wu, Y., Xu, J., Qiu, L., and Zhang, R. (2018, January 20–24). Capacity of UAV-Enabled Multicast Channel: Joint Trajectory Design and Power Allocation. Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA.

3. Optimal LAP Altitude for Maximum Coverage; Kandeepan; IEEE Wirel. Commun. Lett., 2014

4. Constrained k-means clustering; Bradley; Microsoft Res., 2000

5. Trajectory Design for the Aerial Base Stations to Improve Cellular Network Performance; Khamidehi; IEEE Trans. Veh. Technol., 2021

Cited by 1 article.

1. Energy-Efficient Device-to-Device Communications for Green Internet of Things Using Unmanned Aerial Vehicle-Mounted Intelligent Reflecting Surface; Drones; 2024-03-26