Improved Dyna-Q: A Reinforcement Learning Method Focused via Heuristic Graph for AGV Path Planning in Dynamic Environments

Published:2022-11-19 Issue:11 Volume:6 Page:365
ISSN:2504-446X
Container-title:Drones
language:en
Short-container-title:Drones

Author:

Liu Yiyang, Yan Shuaihua, Zhao Yang, Song Chunhe, Li Fei

Abstract

Dyna-Q is a reinforcement learning method widely used in AGV path planning. However, in large, complex, dynamic environments, its sparse reward function and large search space lead to low search efficiency, slow convergence, and in some cases failure to converge, which seriously reduces its performance and practicality. To address these problems, this paper proposes an Improved Dyna-Q algorithm for AGV path planning in large, complex, dynamic environments. First, to address the large search space, this paper proposes a global path guidance mechanism based on a heuristic graph, which reduces the path search space and thus improves the efficiency of obtaining the optimal path. Second, to address the sparse reward function in Dyna-Q, this paper proposes a novel dynamic reward function and an action selection method based on the heuristic graph, which provide denser feedback and more efficient action decisions for AGV path planning, improving the convergence of the algorithm. We evaluated our approach in scenarios with static obstacles and dynamic obstacles. The experimental results show that the proposed algorithm obtains better paths more efficiently than other reinforcement-learning-based methods, including the classical Q-Learning and Dyna-Q algorithms.
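
The abstract gives no formulas, so the sketch below is not the paper's Improved Dyna-Q. It is a minimal tabular version of the classical Dyna-Q baseline on a small grid, with one optional stand-in for the "denser feedback" idea: a breadth-first distance-to-goal map used to reward moves that bring the agent closer. The grid, the names `bfs_distance`, `step`, `dyna_q`, and `greedy_path`, the 0.1 reward scale, and all hyperparameters are assumptions made for illustration.

```python
import random
from collections import deque

# 4-connected moves: up, down, left, right
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

# Hypothetical toy map: 0 = free cell, 1 = static obstacle
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
START, GOAL = (0, 0), (2, 3)

def bfs_distance(grid, goal):
    """Step count from each reachable free cell to the goal (assumed heuristic map)."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ACTIONS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist

def step(grid, state, action, goal, dist=None):
    """Move one cell; moves into walls or obstacles leave the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(grid) and 0 <= c < len(grid[0])) or grid[r][c] == 1:
        r, c = state
    nxt = (r, c)
    if nxt == goal:
        return nxt, 1.0, True
    reward = 0.0  # sparse reward: nothing until the goal is reached
    if dist is not None:
        # Assumed dense variant: small bonus or penalty for changing the distance.
        # Requires every visited cell to be reachable from the goal.
        reward = 0.1 * (dist[state] - dist[nxt])
    return nxt, reward, False

def dyna_q(grid, start, goal, episodes=50, planning_steps=20,
           alpha=0.5, gamma=0.95, epsilon=0.1, dist=None, seed=0):
    """Classical tabular Dyna-Q: direct update, model learning, then planning."""
    rng = random.Random(seed)
    q = {}      # (state, action index) -> estimated value
    model = {}  # (state, action index) -> (next state, reward, done)
    n = len(ACTIONS)

    def greedy(s):
        values = [q.get((s, a), 0.0) for a in range(n)]
        top = max(values)
        return rng.choice([a for a, v in enumerate(values) if v == top])

    def update(s, a, s2, r, done):
        future = 0.0 if done else gamma * max(q.get((s2, b), 0.0) for b in range(n))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (r + future - old)

    for _ in range(episodes):
        s, done, steps = start, False, 0
        while not done and steps < 1000:
            a = rng.randrange(n) if rng.random() < epsilon else greedy(s)
            s2, r, done = step(grid, s, ACTIONS[a], goal, dist)
            update(s, a, s2, r, done)        # learn from real experience
            model[(s, a)] = (s2, r, done)    # remember the observed transition
            for _ in range(planning_steps):  # replay remembered transitions
                ps, pa = rng.choice(list(model))
                update(ps, pa, *model[(ps, pa)])
            s, steps = s2, steps + 1
    return q

def greedy_path(grid, q, start, goal, max_steps=100):
    """Follow the highest-valued action from start, up to max_steps moves."""
    path, s = [start], start
    while s != goal and len(path) <= max_steps:
        a = max(range(len(ACTIONS)), key=lambda b: q.get((s, b), 0.0))
        s, _, _ = step(grid, s, ACTIONS[a], goal)
        path.append(s)
    return path
```

Calling `dyna_q(GRID, START, GOAL)` trains with the sparse reward only, while passing `dist=bfs_distance(GRID, GOAL)` switches on the assumed distance-based bonus; `greedy_path` then reads a route off the learned table. Comparing the two runs on larger maps is one way to see why sparse rewards slow convergence, though the paper's actual reward function and action selection rule may differ substantially.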

Funder

National Key R&D Program of China

LiaoNing Revitalization Talents Program

Natural Science Foundation of Liaoning Province

State Key Laboratory of Robotics

Publisher

MDPI AG

Subject

Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering

Link

https://www.mdpi.com/2504-446X/6/11/365/pdf

Reference 35 articles.

1. Sustainable supply chain management in the digitalisation era: The impact of Automated Guided Vehicles;Bechtsis;J. Clean. Prod.,2017

2. Consumption patterns and the advent of automated guided vehicles, and the trends for automated guided vehicles;Patricio;Curr. Robot. Rep.,2020

3. AGV path planning based on improved Dijkstra algorithm;Sun;Proceedings of the Journal of Physics: Conference Series,2021

4. AGV path planning based on smoothing A* algorithm;Yang;Int. J. Softw. Eng. Appl.,2015

5. Wang, C., Wang, L., Qin, J., Wu, Z., Duan, L., Li, Z., Cao, M., Ou, X., Su, X., and Li, W. (2015, January 8–10). Path planning of automated guided vehicles based on improved A-Star algorithm. Proceedings of the 2015 IEEE International Conference on Information and Automation, Lijiang, China.

Cited by 6 articles.

1. Fusion Q-Learning Algorithm for Open Shop Scheduling Problem with AGVs;Mathematics;2024-01-31

2. A Comprehensive Review of Recent Advances in Automated Guided Vehicle Technologies: Dynamic Obstacle Avoidance in Complex Environment Toward Autonomous Capability;IEEE Transactions on Instrumentation and Measurement;2024

3. NT-ARS-RRT: A novel non-threshold adaptive region sampling RRT algorithm for path planning;Journal of King Saud University - Computer and Information Sciences;2023-10

4. A Novel AGV Path Planning Approach for Narrow Channels Based on the Bi-RRT Algorithm with a Failure Rate Threshold;Sensors;2023-08-30

5. Research on AGV path tracking method based on global vision and reinforcement learning;Science Progress;2023-07