Research on a Personalized Decision Control Algorithm for Autonomous Vehicles Based on the Reinforcement Learning from Human Feedback Strategy

Published: 2024-05-24 Volume: 13 Issue: 11 Page: 2054
ISSN: 2079-9292
Journal: Electronics
Language: en

Author:

Li Ning¹, Chen Pengzhan¹

Affiliation:

1. School of Intelligent Manufacturing, Taizhou University, Taizhou 318000, China

Abstract

Previous autonomous decision models often overlook the personalized features of users. To address this shortcoming, this paper proposes a personalized decision control algorithm for autonomous vehicles based on reinforcement learning from human feedback (RLHF). The algorithm combines two reinforcement learning approaches, Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO), and divides the control scheme into three phases: pre-training, human evaluation, and parameter optimization. In the pre-training phase, an agent is trained using the DDPG algorithm. In the human evaluation phase, different trajectories generated by the DDPG-trained agent are scored by individuals with different driving styles, and the respective reward models are trained on the scored trajectories. In the parameter optimization phase, the network parameters are updated using the PPO algorithm with the reward values given by the reward model, achieving personalized autonomous vehicle control. To validate the proposed control algorithm, a simulation scenario was built in CARLA 0.9.13. The results demonstrate that the algorithm can provide personalized decision control solutions for people with different driving styles, satisfying human needs while ensuring safety.
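
The human evaluation and parameter optimization phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear reward model, the two-element trajectory feature vectors, and the function names `fit_reward_model`, `predict_reward`, and `ppo_clipped_objective` are all assumptions made for illustration, and the abstract does not state the reward model's architecture. The sketch fits a reward model to human scores by gradient descent on mean squared error, then evaluates the standard PPO clipped surrogate objective.

```python
from typing import List, Sequence, Tuple


def fit_reward_model(
    features: Sequence[Sequence[float]],
    scores: Sequence[float],
    lr: float = 0.1,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit a linear reward r(x) = w.x + b to human scores (assumed model form)."""
    n, dim = len(features), len(features[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, scores):
            # Prediction error for one scored trajectory.
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for i, xi in enumerate(x):
                grad_w[i] += 2.0 * err * xi / n
            grad_b += 2.0 * err / n
        # Gradient descent step on the mean squared error.
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_reward(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Reward assigned by the fitted model to a trajectory feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over the samples."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, r))
        total += min(r * a, clipped * a)
    return total / len(ratios)


# Hypothetical scored trajectories: two made-up features per trajectory.
toy_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_scores = [0.5, 2.5, -0.5, 1.5]
w, b = fit_reward_model(toy_features, toy_scores)
```

In a per-style setup such as the one the abstract describes, one such model would be fitted per evaluator style, and its predicted rewards would feed the advantage estimates passed to the PPO update.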

Funder

National Natural Science Foundation of China

Zhejiang Provincial Department of Education

Publisher

MDPI AG

Link

https://www.mdpi.com/2079-9292/13/11/2054/pdf
