Affiliation:
1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
Abstract
This paper proposes a reinforcement learning-based power allocation scheme for underwater acoustic communication networks (UACNs). The objective is formulated as maximizing the channel capacity under maximum power and minimum channel capacity constraints. To solve this problem, a multi-agent deep deterministic policy gradient (MADDPG) approach is introduced in which each transmitter node acts as an agent. After the problem is modeled as a Markov decision process (MDP), the agents learn to collaboratively maximize the channel capacity through deep deterministic policy gradient (DDPG) learning. Specifically, the power allocation of each agent is obtained by a centralized training and distributed execution (CTDE) method. Simulation results show that the sum rate achieved by the proposed algorithm approaches that of the fractional programming (FP) algorithm and exceeds that of the deep Q-network (DQN)-based power allocation algorithm by at least 5%.
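For concreteness, the constrained problem named in the abstract can be sketched as follows; the interference-channel model and all symbols below are illustrative notation assumed here, not taken from the paper:

\begin{aligned}
\max_{\{p_i\}} \quad & \sum_{i=1}^{N} B \log_2\!\left(1 + \frac{p_i g_{ii}}{\sum_{j \neq i} p_j g_{ji} + \sigma^2}\right) \\
\text{s.t.} \quad & 0 \le p_i \le p_{\max}, \qquad C_i \ge C_{\min}, \qquad i = 1, \dots, N,
\end{aligned}

where p_i is the transmit power of node i, g_{ji} the channel gain from transmitter j to receiver i, B the bandwidth, sigma^2 the noise power, and C_i the capacity of link i.

The following is a minimal PyTorch sketch of the CTDE idea described in the abstract: each transmitter has a local actor that maps its own observation to a transmit power, while a centralized critic scores the joint state and joint action during training. Network sizes, hyper-parameters, and all names are assumptions; critic (TD) training, the replay buffer, reward shaping for the minimum-capacity constraint, and target networks are omitted.

# Minimal MADDPG/CTDE sketch for multi-agent power allocation (illustrative only).
import torch
import torch.nn as nn

N_AGENTS = 4      # assumed number of transmitter nodes (agents)
OBS_DIM = 6       # assumed size of each agent's local observation
P_MAX = 1.0       # assumed maximum transmit power (normalized)

class Actor(nn.Module):
    # Deterministic policy: local observation -> transmit power in [0, P_MAX].
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1), nn.Sigmoid())
    def forward(self, obs):
        return P_MAX * self.net(obs)

class Critic(nn.Module):
    # Centralized action-value function: joint observations + joint powers -> Q.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(N_AGENTS * (OBS_DIM + 1), 128),
                                 nn.ReLU(), nn.Linear(128, 1))
    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

actors = [Actor() for _ in range(N_AGENTS)]
critics = [Critic() for _ in range(N_AGENTS)]
actor_opts = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in actors]

batch = 32
obs = torch.randn(batch, N_AGENTS, OBS_DIM)   # stands in for a replay-buffer sample

# Centralized training: each agent's critic scores the joint state/action,
# but the policy gradient only updates that agent's own actor.
for i in range(N_AGENTS):
    powers = torch.cat(
        [actors[j](obs[:, j]) if j == i else actors[j](obs[:, j]).detach()
         for j in range(N_AGENTS)], dim=-1)
    actor_loss = -critics[i](obs.reshape(batch, -1), powers).mean()  # DDPG objective
    actor_opts[i].zero_grad()
    actor_loss.backward()
    actor_opts[i].step()

# Distributed execution: at run time, agent i sets its power from its local
# observation alone, without exchanging information with the other nodes.
with torch.no_grad():
    p_0 = actors[0](torch.randn(OBS_DIM))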
Funder
Innovation Program of Shanghai Municipal Education Commission of China
Shanghai Sailing Program
National Natural Science Foundation of China