A Multi-Agent Deep Reinforcement Learning-Based Popular Content Distribution Scheme in Vehicular Networks

Author:

Chen Wenwei (1), Huang Xiujie (1, 2), Guan Quanlong (1, 2), Zhao Shancheng (1, 2)

Affiliation:

1. College of Information Science and Technology, Jinan University, Guangzhou 510632, China

2. Guangdong Key Laboratory of Data Security and Privacy Preserving, Guangzhou 511443, China

Abstract

The Internet of Vehicles (IoV) enables vehicular data services and applications through vehicle-to-everything (V2X) communications. One of the key services provided by IoV is popular content distribution (PCD), which aims to quickly deliver popular content that most vehicles request. However, vehicles often cannot receive the complete popular content from roadside units (RSUs) because of their mobility and the RSUs' limited coverage. Collaboration among vehicles via vehicle-to-vehicle (V2V) communications is an effective way to help more vehicles obtain the entire popular content at a lower time cost. To this end, we propose a multi-agent deep reinforcement learning (MADRL)-based popular content distribution scheme for vehicular networks, in which each vehicle deploys an MADRL agent that learns to choose an appropriate data transmission policy. To reduce the complexity of the MADRL-based algorithm, a vehicle clustering algorithm based on spectral clustering divides all vehicles in the V2V phase into groups, so that only vehicles within the same group exchange data. The multi-agent proximal policy optimization (MAPPO) algorithm is then used to train the agents. We introduce a self-attention mechanism into the neural network of the MADRL agent to help it represent the environment accurately and make decisions. Furthermore, invalid action masking is used to prevent the agent from taking invalid actions, which accelerates training. Experimental results and a comprehensive comparison show that our MADRL-PCD scheme outperforms both the coalition game-based scheme and the greedy strategy-based scheme, achieving higher PCD efficiency and lower transmission delay.
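The invalid action masking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `masked_policy`, the example logits, and the validity mask are all hypothetical, and the sketch assumes a discrete action space in which each vehicle knows which transmission actions are currently infeasible. The logits of invalid actions are set to negative infinity before the softmax, so those actions receive exactly zero probability and can never be sampled.

```python
import math


def masked_policy(logits, valid_mask):
    """Softmax over logits, with invalid actions forced to zero probability.

    `logits` and `valid_mask` are hypothetical inputs: one raw policy score
    and one validity flag per discrete action.
    """
    if len(logits) != len(valid_mask):
        raise ValueError("logits and valid_mask must have the same length")
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf, so exp() maps them to 0.0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid_mask)]
    # Subtract the largest (valid) logit for numerical stability.
    peak = max(masked)
    weights = [math.exp(x - peak) for x in masked]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: actions 1 and 3 are assumed to be invalid.
probs = masked_policy([2.0, 1.0, 0.5, 3.0], [True, False, True, False])
```

Because masked actions carry zero probability, they also contribute nothing to the policy gradient, which is one common explanation for why masking speeds up training.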

Funder

Science and Technology Planning Project of Guangdong

Guangdong Provincial NSF

Science and Technology Planning Project of Guangzhou

Key Laboratory of Smart Education of Guangdong Higher Education Institutes, Jinan University

Jinan University

Opening Project of Key Laboratory of Safety of Intelligent Robots for State Market Regulation

NSFC

Publisher

MDPI AG

Subject

General Physics and Astronomy
