End-to-End AUV Local Motion Planning Method Based on Deep Reinforcement Learning

Published:2023-09-14 Issue:9 Volume:11 Page:1796
ISSN:2077-1312
Container-title:Journal of Marine Science and Engineering
language:en
Short-container-title:JMSE

Author:

Lyu Xi¹, Sun Yushan¹, Wang Lifeng², Tan Jiehui¹, Zhang Liwen¹

Affiliation:

1. Science and Technology on Underwater Vehicle Laboratory, Harbin Engineering University, Harbin 150001, China

2. Marine Design and Research Institute of China, Shanghai 200011, China

Abstract

This study aims to solve the problems of sparse reward, single policy, and poor environmental adaptability in the local motion planning task of autonomous underwater vehicles (AUVs). We propose a two-layer deep deterministic policy gradient algorithm-based end-to-end perception–planning–execution method to overcome the challenges associated with training and learning in end-to-end approaches that directly output control forces. In this approach, the state set is established based on the environment information, the action set is established based on the motion characteristics of the AUV, and the control execution force set is established based on the control constraints. The mapping relations between each set are trained using deep reinforcement learning, enabling the AUV to perform the corresponding action in the current state, thereby accomplishing tasks in an end-to-end manner. Furthermore, we introduce the hindsight experience replay (HER) method in the perception planning mapping process to enhance stability and sample efficiency during training. Finally, we conduct simulation experiments encompassing planning, execution, and end-to-end performance evaluation. Simulation training demonstrates that our proposed method exhibits improved decision-making capabilities and real-time obstacle avoidance during planning. Compared to global planning, the end-to-end algorithm comprehensively considers constraints in the AUV planning process, resulting in more realistic AUV actions that are gentler and more stable, leading to controlled tracking errors.
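The hindsight experience replay step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the HER variant, the state layout, or the reward function, so everything below is an assumption made for illustration: a hypothetical 2-D position state, a sparse reward of 0 within a tolerance of the goal and -1 otherwise, and the "final" relabeling strategy, in which each stored transition is replayed as if the position actually reached at the end of the episode had been the goal.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Point = Tuple[float, float]  # hypothetical 2-D AUV position, for illustration only


@dataclass(frozen=True)
class Transition:
    position: Point       # position before the action
    action: Point         # commanded displacement (assumed action form)
    next_position: Point  # position after the action
    goal: Point           # goal the policy was conditioned on
    reward: float         # sparse reward under that goal


def sparse_reward(achieved: Point, goal: Point, tol: float = 0.5) -> float:
    """Assumed sparse reward: 0 if within tol of the goal, otherwise -1."""
    dist = ((achieved[0] - goal[0]) ** 2 + (achieved[1] - goal[1]) ** 2) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel_final(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, replacing the goal with the final achieved position
    and recomputing each reward under that substituted goal."""
    if not episode:
        return []
    new_goal = episode[-1].next_position
    return [
        replace(t, goal=new_goal, reward=sparse_reward(t.next_position, new_goal))
        for t in episode
    ]


# A failed episode: the AUV moves from (0, 0) to (3, 0) but the goal was (10, 0).
original_goal = (10.0, 0.0)
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
episode = [
    Transition(p, (1.0, 0.0), q, original_goal, sparse_reward(q, original_goal))
    for p, q in zip(positions, positions[1:])
]
relabeled = her_relabel_final(episode)

# Both versions would be stored in the replay buffer.
replay_buffer = episode + relabeled
```

Under these assumptions, every reward in the original episode is -1, while the relabeled copy ends with a reward of 0, so the replay buffer contains at least one successful transition even though the original goal was never reached. This is the mechanism by which HER eases the sparse-reward problem.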

Funder

Natural Science Foundation of Heilongjiang Province of China

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

Ocean Engineering, Water Science and Technology, Civil and Structural Engineering

Link

https://www.mdpi.com/2077-1312/11/9/1796/pdf

Reference 35 articles.

1. A Search-based Path Planning Algorithm with Topological Constraints. Application to an AUV;Carreras;IFAC Proc. Vol.,2011

2. Carsten, J., Ferguson, D., and Stentz, A. (2006, January 9–15). 3D field D*: Improved path planning and replanning in three dimensions. Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China.

3. Garau, B., Alvarez, A., and Oliver, G. (2005, January 18–22). Path planning of autonomous underwater vehicles in current fields with complex spatial variability: An A* approach. Proceedings of the 2005 IEEE International Conference on Robotics and Automation (ICRA), Barcelona, Spain.

4. Obstacle avoidance in underwater glider path planning;Sosa;J. Phys. Agents,2012

5. Khatib, O. (1985, January 25–28). Real-time obstacle avoidance for manipulators and mobile robots. Proceedings of the 1985 IEEE International Conference on Robotics and Automation, St. Louis, MO, USA.

Cited by 2 articles.

1. Adaptive energy-efficient reinforcement learning for AUV 3D motion planning in complex underwater environments;Ocean Engineering;2024-11

2. An AUV collision avoidance algorithm in unknown environment with multiple constraints;Ocean Engineering;2024-02