Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle

Published:2019-09-30 Issue:1 Volume:103 Page:003685041987902
ISSN:0036-8504
Container-title:Science Progress
language:en
Short-container-title:Science Progress

Author:

Xie Ronglei¹, Meng Zhijun¹, Zhou Yaoming¹, Ma Yunpeng¹, Wu Zhe¹

Affiliation:

1. School of Aeronautic Science and Engineering, Beihang University, Beijing, China

Abstract

Existing reinforcement learning algorithms converge with difficulty on three-dimensional path planning for unmanned aerial vehicles because the state space is very large. To address this, this article proposes a reinforcement learning algorithm that combines a heuristic function with an experience replay mechanism based on the maximum average reward value. Knowledge of track performance is used to construct the heuristic function, which guides the unmanned aerial vehicle's action selection and reduces useless exploration. The experience replay mechanism based on maximum average reward increases the utilization rate of excellent samples and the convergence speed of the algorithm. Simulation results show that the proposed three-dimensional path planning algorithm learns efficiently and that its convergence speed and training performance are significantly improved.
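The abstract does not give the form of the heuristic function or the exact replay rule, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes a 3D grid world with six unit moves, a hypothetical heuristic equal to the reduction in Manhattan distance to the goal, an additive weight `xi` on that heuristic during greedy action selection, and a hypothetical buffer `BestEpisodeReplay` that keeps only the episode with the highest average reward. All names and parameter values here are illustrative.

```python
import random
from collections import defaultdict

# Six unit moves on a 3D grid (an assumption; the paper's action set is not given here).
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def heuristic(state, action, goal):
    """Hypothetical heuristic: how much the move reduces Manhattan distance to the goal."""
    nxt = tuple(s + a for s, a in zip(state, action))

    def dist(p):
        return sum(abs(g - c) for g, c in zip(goal, p))

    return dist(state) - dist(nxt)  # +1 when moving toward the goal, -1 when moving away


def select_action(q, state, goal, epsilon, xi, rng):
    """Epsilon-greedy selection where the greedy choice is biased by xi * heuristic."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)] + xi * heuristic(state, a, goal))


def q_update(q, s, a, r, s2, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update."""
    best_next = max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


class BestEpisodeReplay:
    """Hypothetical buffer: stores the episode with the highest average reward seen so far."""

    def __init__(self):
        self.best_avg = float("-inf")
        self.best_episode = []

    def offer(self, episode):
        """episode is a list of (state, action, reward, next_state); returns True if stored."""
        if not episode:
            return False
        avg = sum(t[2] for t in episode) / len(episode)
        if avg > self.best_avg:
            self.best_avg, self.best_episode = avg, list(episode)
            return True
        return False

    def replay(self, q, alpha=0.1, gamma=0.9):
        """Replay the stored episode backwards so reward propagates toward the start."""
        for s, a, r, s2 in reversed(self.best_episode):
            q_update(q, s, a, r, s2, alpha, gamma)
```

In a training loop under these assumptions, each finished episode would be passed to `offer`, and `replay` would be called between episodes so that the best trajectory found so far is reused more often than ordinary samples.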

Funder

National Natural Science Foundation of China

Publisher

SAGE Publications

Subject

Multidisciplinary

Link

http://journals.sagepub.com/doi/pdf/10.1177/0036850419879024

References: 12 articles.

1. Autonomous mobile robot dynamic motion planning using hybrid fuzzy potential field

2. A new vibrational genetic algorithm enhanced with a Voronoi diagram for path planning of autonomous UAV

3. Comparison of Parallel Genetic Algorithm and Particle Swarm Optimization for Real-Time UAV Path Planning

4. Reinforcement Learning: An Introduction

Cited by 16 articles.

1. An adaptive Q-learning based particle swarm optimization for multi-UAV path planning;Soft Computing;2024-07

2. Energy Efficient Path Planning Scheme for Unmanned Aerial Vehicle Using Hybrid Generic Algorithm-Based Q-Learning Optimization;IEEE Access;2024

3. Reinforcement learning-based UAV Swarm motion planning for field monitoring;2023 IEEE 20th India Council International Conference (INDICON);2023-12-14

4. UAV 3D online track planning based on improved SAC algorithm;Journal of the Brazilian Society of Mechanical Sciences and Engineering;2023-12-09

5. Map Optimization of Path Planning in Q-Learning;Highlights in Science, Engineering and Technology;2023-08-08