Intelligent path planning of mobile robot based on Deep Deterministic Policy Gradient

Published: 2022-12-06

Author:

Gong Hui¹, Wang Peng¹, Ni Cui¹, Cheng Nuo¹, Wang Hua²

Affiliation:

1. Shandong Jiaotong University

2. Shandong University of Traditional Chinese Medicine

Abstract

Deep Deterministic Policy Gradient (DDPG) is a deep reinforcement learning algorithm that is widely used in mobile robot path planning. Built on the Actor-Critic framework, it handles continuous action spaces and can therefore keep the robot's motion continuous, which gives it great potential in this field. However, because the Critic network always selects the maximum Q value to evaluate the robot's actions, its Q-value estimates are inaccurate. In addition, DDPG samples experience uniformly at random, which cannot make efficient use of the more important samples; as a result, the path planning model converges slowly during training and easily falls into a local optimum. In this paper, a dueling network is introduced into DDPG to improve the accuracy of Q-value estimation, and the reward function is optimized to increase the immediate reward and direct the mobile robot toward the target point faster. To further improve the efficiency of experience replay, the single experience pool is split into two by jointly considering the influence of average reward and TD-error on sample importance, and a dynamic adaptive sampling mechanism is adopted to sample the two pools separately. Finally, experiments were carried out in a simulation environment built with ROS and the Gazebo platform. The results show that the proposed path planning algorithm converges quickly and is highly stable, and that its success rate reaches 100% in the environment without obstacles and 93% in the environment with obstacles.
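
The two-pool experience replay described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the splitting rule or the sampling schedule, so everything specific here is an assumption made for illustration: the class name `DualPoolReplay`, the rule that a transition is "important" when its reward exceeds the running average reward or its absolute TD-error exceeds `td_threshold`, and the linear schedule that moves the share of important samples from 0.8 to 0.5 as training progresses. It is not the authors' implementation.

```python
import random
from collections import deque


class DualPoolReplay:
    """Hypothetical two-pool replay buffer (illustrative, not the paper's code)."""

    def __init__(self, capacity=10000, td_threshold=1.0, seed=None):
        self.high = deque(maxlen=capacity)  # transitions judged more important
        self.low = deque(maxlen=capacity)   # all other transitions
        self.td_threshold = td_threshold    # assumed TD-error cut-off
        self.reward_sum = 0.0
        self.count = 0
        self.rng = random.Random(seed)

    def add(self, transition, reward, td_error):
        # Update the running average reward over everything seen so far.
        self.count += 1
        self.reward_sum += reward
        avg_reward = self.reward_sum / self.count
        # Assumed importance rule: above-average reward OR large TD-error.
        important = reward > avg_reward or abs(td_error) > self.td_threshold
        (self.high if important else self.low).append(transition)

    def sample(self, batch_size, progress):
        # progress in [0, 1] is the fraction of training completed.
        progress = min(max(progress, 0.0), 1.0)
        # Assumed schedule: share of the important pool decays from 0.8 to 0.5.
        high_share = 0.8 - 0.3 * progress
        n_high = min(round(batch_size * high_share), len(self.high))
        n_low = min(batch_size - n_high, len(self.low))
        # If the ordinary pool is short, top up from the important pool.
        n_high = min(batch_size - n_low, len(self.high))
        batch = self.rng.sample(list(self.high), n_high)
        batch += self.rng.sample(list(self.low), n_low)
        self.rng.shuffle(batch)
        return batch
```

In a training loop, one would call `add(...)` after each environment step and `sample(batch_size, progress=step / total_steps)` before each update; both pools are still sampled uniformly internally, so only the mixing ratio between them adapts.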

Publisher

Research Square Platform LLC

References: 36 articles.

1. Chen J, Du C, Zhang Y, et al. A clustering-based coverage path planning method for autonomous heterogeneous UAVs[J]. IEEE Transactions on Intelligent Transportation Systems, 2021.

2. Liu L, Lin J, Yao J, et al. Path planning for smart car based on Dijkstra algorithm and dynamic window approach[J]. Wireless Communications and Mobile Computing, 2021, 2021.

3. Bagheri S M, Taghaddos H, Mousaei A, et al. An A-Star algorithm for semi-optimization of crane location and configuration in modular construction[J]. Automation in Construction, 2021, 121: 103447.

4. Duhé J F, Victor S, Melchior P. Contributions on artificial potential field method for effective obstacle avoidance[J]. Fractional Calculus and Applied Analysis, 2021, 24(2): 421–446.

5. Han S, Xiao L. An improved adaptive genetic algorithm[C]//SHS Web of Conferences. EDP Sciences, 2022, 140: 01044.

Cited by 5 articles.

1. Autonomous Navigation of Robots: Optimization with DQN; Applied Sciences; 2023-06-16

2. SLP-Improved DDPG Path-Planning Algorithm for Mobile Robot in Large-Scale Dynamic Environment; Sensors; 2023-03-28

3. Multi-Robot Collaborative Flexible Manufacturing and Digital Twin System Design of Circuit Breakers; Applied Sciences; 2023-02-20

4. End-to-End One-Shot Path-Planning Algorithm for an Autonomous Vehicle Based on a Convolutional Neural Network Considering Traversability Cost; Sensors; 2022-12-10

5. Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in Sparse Reward Environments; Sensors; 2022-12-07