Abstract
In this paper, we propose a novel Deep Reinforcement Learning (DRL) algorithm which can navigate non-holonomic robots with continuous control in an unknown dynamic environment with moving obstacles. We call the approach MK-A3C (Memory and Knowledge-based Asynchronous Advantage Actor-Critic) for short. As its first component, MK-A3C builds a GRU-based memory neural network to enhance the robot’s capability for temporal reasoning. Robots without it tend to suffer from a lack of rationality in face of incomplete and noisy estimations for complex environments. Additionally, robots with certain memory ability endowed by MK-A3C can avoid local minima traps by estimating the environmental model. Secondly, MK-A3C combines the domain knowledge-based reward function and the transfer learning-based training task architecture, which can solve the non-convergence policies problems caused by sparse reward. These improvements of MK-A3C can efficiently navigate robots in unknown dynamic environments, and satisfy kinetic constraints while handling moving objects. Simulation experiments show that compared with existing methods, MK-A3C can realize successful robotic navigation in unknown and challenging environments by outputting continuous acceleration commands.
Funder
National Science Foundation of Hunan Province
Subject
Electrical and Electronic Engineering,Biochemistry,Instrumentation,Atomic and Molecular Physics, and Optics,Analytical Chemistry
Cited by
51 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献