A Deep Coordination Graph Convolution Reinforcement Learning for Multi-Intelligent Vehicle Driving Policy

Author:

Si Huaiwei1, Tan Guozhen1, Zuo Hao1

Affiliation:

1. School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Ganjingzi District, Dalian City, Liaoning Province 116024, China

Abstract

As Internet of Things (IoT) technology has matured, IoT applications have become widespread in the field of intelligent vehicles. As a result, artificial intelligence algorithms, especially deep reinforcement learning (DRL) methods, are increasingly used in autonomous driving. Early work applied many DRL techniques to the behavior planning module of single-vehicle autonomous driving. However, autonomous driving takes place in an environment where multiple intelligent vehicles coexist, interact with each other, and change dynamically. In this environment, multiagent reinforcement learning (RL) is one of the most promising technologies for solving the coordinated behavior planning problem of multiple vehicles, yet research on this topic remains scarce. This paper introduces a dynamic coordination graph (CG) convolution technique for the cooperative learning of multiple intelligent vehicles. The method dynamically constructs a CG among the vehicles, which reduces the influence of unrelated vehicles and simplifies the learning process. The relationships between intelligent vehicles are refined using an attention mechanism, and graph convolutional RL is used to emulate the message-passing aggregation algorithm, maximizing the local utilities to obtain the maximum joint utility that guides coordination learning. Driving samples are used as training data, and the model guided by reward shaping is combined with the model-free graph convolutional RL method, which enables the proposed method to achieve high gradualness and improve its learning efficiency. In addition, because the graph convolutional RL algorithm shares parameters between agents, it scales easily to large multiagent systems such as traffic environments.
Finally, the proposed algorithm is tested and verified on the multivehicle cooperative lane-changing problem in an autonomous driving simulation environment. Experimental results show that the proposed method has a better value function representation, in that it learns better coordinated driving policies than traditional dynamic coordination algorithms.
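
The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea of a dynamically built coordination graph followed by attention-weighted neighbor aggregation. Everything concrete in it is an assumption made for illustration: the distance-threshold rule for adding edges, the scaled dot-product attention score, the absence of learned weights, and the hypothetical names `build_coordination_graph` and `attention_aggregate`. It is not the authors' implementation.

```python
import math


def build_coordination_graph(positions, radius):
    """Link each vehicle to every other vehicle within `radius` (assumed edge rule)."""
    n = len(positions)
    return [
        [j for j in range(n)
         if j != i and math.dist(positions[i], positions[j]) <= radius]
        for i in range(n)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features, neighbors):
    """One aggregation step: each vehicle mixes its own and its neighbors' features.

    The same scoring rule is applied to every vehicle, which mirrors the idea of
    sharing parameters between agents. No learned weights are used here (assumption).
    """
    dim = len(features[0])
    aggregated = []
    for i, own in enumerate(features):
        group = [i] + neighbors[i]  # a vehicle always attends to itself
        scores = [
            sum(a * b for a, b in zip(own, features[j])) / math.sqrt(dim)
            for j in group
        ]
        weights = softmax(scores)
        aggregated.append([
            sum(w * features[j][k] for w, j in zip(weights, group))
            for k in range(dim)
        ])
    return aggregated


# Three vehicles on a road; the third is far away and therefore unrelated.
positions = [(0.0, 0.0), (5.0, 0.0), (100.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

graph = build_coordination_graph(positions, radius=10.0)
mixed = attention_aggregate(features, graph)
print(graph)
print(mixed)
```

In this toy setup the distant vehicle has no edges, so its features pass through unchanged, while the two nearby vehicles exchange information. Rebuilding the graph at every time step from current positions is what would make it dynamic.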

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Information Systems

Cited by 2 articles.
