A Deep Reinforcement Learning Scheme for Spectrum Sensing and Resource Allocation in ITS

Published: 2023-08-08, Volume: 11, Issue: 16, Page: 3437
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Short-container-title: Mathematics

Author:

Wei Huang¹, Peng Yuyang¹, Yue Ming¹, Long Jiale², AL-Hazemi Fawaz³, Mirza Mohammad Meraj⁴

Affiliation:

1. The School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China

2. Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China

3. Department of Computer and Network Engineering, University of Jeddah, Jeddah 21959, Saudi Arabia

4. Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia

Abstract

In recent years, the Internet of Vehicles (IoV) has shown great potential for advancing intelligent transportation systems (ITSs) and smart cities. However, traditional IoV schemes have difficulty dealing with uncertain environments, whereas reinforcement learning is well suited to them, and spectrum resource allocation in IoV faces an uncertain environment in most cases. This paper therefore investigates spectrum resource allocation by deep reinforcement learning, applied after spectrum sensing, in an ITS that includes vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) links. Spectrum resource allocation is modeled as a multi-agent reinforcement learning problem and solved with the soft actor-critic (SAC) algorithm. Each V2V link is treated as an agent: the agents interact with the vehicular environment and together form a joint action. Each agent then receives its own observation and a common reward, and updates its networks using experiences drawn from memory. Over a given period, each V2V link can thus optimize its spectrum allocation scheme to maximize the V2I capacity and increase the V2V payload delivery rate. However, the number of SAC networks grows linearly with the number of V2V links, so convergence may become difficult when there are too many V2V links. Consequently, a new algorithm, parameter sharing soft actor-critic (PSSAC), is proposed to reduce the complexity so that the model converges more easily. The simulation results show that both SAC and PSSAC improve the V2I capacity and increase the V2V payload transmission success probability within a certain time. Specifically, the proposed schemes achieve a 10 percent performance improvement over the existing scheme in the vehicular environment. Additionally, PSSAC has a lower complexity.
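
The scaling argument in the abstract can be illustrated with a rough parameter count. The sketch below is not the paper's implementation: it counts only one fully connected network per agent, the layer sizes are hypothetical, and appending a one-hot agent index to the observation is an assumed (though common) way to let agents share one network. The abstract does not say how PSSAC distinguishes agents.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def independent_params(num_agents, obs_dim, hidden, num_actions):
    """One separate network per V2V agent: the total grows linearly."""
    return num_agents * mlp_param_count([obs_dim, *hidden, num_actions])


def shared_params(num_agents, obs_dim, hidden, num_actions):
    """One network for all agents (assumption: a one-hot agent index
    is appended to the observation, widening only the input layer)."""
    return mlp_param_count([obs_dim + num_agents, *hidden, num_actions])


def shared_input(observation, agent_index, num_agents):
    """Build the shared network's input for one agent (hypothetical helper)."""
    one_hot = [0.0] * num_agents
    one_hot[agent_index] = 1.0
    return list(observation) + one_hot


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    obs_dim, hidden, num_actions = 30, [256, 128], 16
    for n in (4, 8, 16):
        print(n,
              independent_params(n, obs_dim, hidden, num_actions),
              shared_params(n, obs_dim, hidden, num_actions))
```

Under these assumptions, each additional agent adds a full network in the independent case but only one extra input column (`hidden[0]` weights) in the shared case.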

Funder

The Science and Technology Development Fund, Macau SAR

Wuyi University-Hong Kong-Macau Joint Research and Development Fund

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/11/16/3437/pdf

References (22 articles)

1. Deep reinforcement learning based resource allocation algorithm in cellular networks; Liao; J. Commun., 2019

2. Research of dynamic channel allocation algorithm for multi-radio multi-channel VANET; Min; Appl. Res. Comput., 2014

3. Cognitive Spectrum Allocation Mechanism in Internet of Vehicles Based on Clustering Structure; Xue; Comput. Sci., 2019

4. Radio resource management for D2D-based V2V communication; Sun; IEEE Trans. Veh. Technol., 2016

5. Cluster-based radio resource management for D2D-supported safety-critical V2X communications; Sun; IEEE Trans. Wirel. Commun., 2016