Joint Resource Scheduling of the Time Slot, Power, and Main Lobe Direction in Directional UAV Ad Hoc Networks: A Multi-Agent Deep Reinforcement Learning Approach

Published:2024-09-12 Issue:9 Volume:8 Page:478
ISSN:2504-446X
Container-title:Drones
language:en
Short-container-title:Drones

Author:

Liang Shijie¹², Zhao Haitao², Zhou Li², Wang Zhe², Cao Kuo², Wang Junfang¹

Affiliation:

1. The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang 050081, China

2. College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China

Abstract

Directional unmanned aerial vehicle (UAV) ad hoc networks (DUANETs) are widely applied because of their high flexibility, strong anti-interference capability, and high transmission rates. However, complex mutual interference persists within directional networks, so the time slot, power, and main lobe direction of every link must be scheduled to improve the transmission performance of DUANETs. To maintain both transmission fairness and the total number of transmitted data packets in the DUANET under dynamic data transmission demands, a scheduling algorithm for the time slot, power, and main lobe direction based on multi-agent deep reinforcement learning (MADRL) is proposed. Specifically, the problem is modeled around the links: the time slot, power, and main lobe direction variables are optimized for the fairness-weighted count of transmitted data packets. A decentralized partially observable Markov decision process (Dec-POMDP) is constructed for the problem. To process the observations in the Dec-POMDP, an attention mechanism-based observation processing method is proposed that extracts observation features of UAVs and their neighbors within the main lobe range, enhancing algorithm performance. The proposed Dec-POMDP and MADRL algorithm enable distributed autonomous decision-making for the resource scheduling of time slots, power, and main lobe directions. Finally, simulations compare the proposed algorithm with existing algorithms across varying data packet generation rates, main lobe gains, and main lobe widths. The simulation results show that the proposed attention mechanism-based MADRL algorithm improves the performance of the MADRL algorithm by 22.17%. The algorithm with main lobe direction scheduling improves performance by 67.06% compared to the algorithm without main lobe direction scheduling.
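The abstract does not specify how the attention-based observation processing is built, so the following is only a minimal sketch of the general idea under stated assumptions: neighbors are first filtered to those whose bearing falls inside the main lobe, and their feature vectors are then pooled with scaled dot-product attention. The function names (`in_main_lobe`, `attend`), the 2-D geometry, and the use of identity projections in place of learned weights are all illustrative assumptions, not the authors' implementation.

```python
import math


def in_main_lobe(own_xy, neighbor_xy, lobe_direction, lobe_width):
    """Return True if the neighbor's bearing lies within the main lobe.

    Assumption: 2-D positions, angles in radians, lobe centered on lobe_direction.
    """
    bearing = math.atan2(neighbor_xy[1] - own_xy[1], neighbor_xy[0] - own_xy[0])
    # Wrap the angular offset into [-pi, pi) before comparing with the half-width.
    offset = (bearing - lobe_direction + math.pi) % (2 * math.pi) - math.pi
    return abs(offset) <= lobe_width / 2


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, neighbor_features):
    """Pool neighbor features with scaled dot-product attention.

    Assumption: identity query/key/value projections (no learned weights);
    a trained model would apply learned linear maps first.
    """
    if not neighbor_features:
        return [0.0] * len(query), []
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, feat)) / scale
        for feat in neighbor_features
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, neighbor_features))
        for i in range(len(query))
    ]
    return pooled, weights


# Hypothetical usage: one UAV at the origin pointing its main lobe along +x.
own_position = (0.0, 0.0)
own_features = [1.0, 0.0]
neighbors = [
    ((10.0, 1.0), [1.0, 0.0]),   # roughly ahead: inside a 60-degree lobe
    ((-5.0, 0.0), [0.0, 1.0]),   # directly behind: outside the lobe
    ((8.0, -2.0), [0.5, 0.5]),   # slightly off-axis: inside the lobe
]
visible = [
    feat for pos, feat in neighbors
    if in_main_lobe(own_position, pos, lobe_direction=0.0, lobe_width=math.pi / 3)
]
pooled_features, attention_weights = attend(own_features, visible)
```

In this toy setup only the two neighbors ahead of the UAV contribute to `pooled_features`, and the attention weights determine how strongly each one influences the fixed-length vector that an agent's policy network would consume.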

Funder

National Natural Science Foundation of China

National Natural Science Foundation of Hunan Province, China

Publisher

MDPI AG

Link

https://www.mdpi.com/2504-446X/8/9/478/pdf
