Mars Exploration: Research on Goal-Driven Hierarchical DQN Autonomous Scene Exploration Algorithm
Published: 2024-08-22
Issue: 8
Volume: 11
Page: 692
ISSN: 2226-4310
Container-title: Aerospace
Language: en
Short-container-title: Aerospace
Author:
Zhou Zhiguo (1), Chen Ying (1), Yu Jiabao (1), Zu Bowen (1), Wang Qian (1), Zhou Xuehua (1), Duan Junwei (2)
Affiliation:
1. School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
2. Faculty of Data Science, City University of Macau, Macau, China
Abstract
Mars exploration missions involve non-deterministic, large-scale navigation environments with a large action space and many environmental states. Traditional reinforcement learning algorithms that receive rewards only at target points and obstacles suffer from reward sparsity and dimension explosion, which makes training too slow or even infeasible. This work proposes a deep hierarchical learning algorithm, the goal-driven hierarchical deep Q-network (GDH-DQN), which is better suited to mapless exploration, navigation, and obstacle avoidance by mobile robots. The model has two layers: the lower layer provides behavioral strategies for achieving short-term goals, and the upper layer provides selection strategies over multiple short-term goals. Known position nodes serve as short-term goals that guide the mobile robot forward and help it achieve the long-term goal of obstacle avoidance. Hierarchical execution not only simplifies the task but also effectively addresses reward sparsity and dimension explosion. In addition, each layer integrates a Hindsight Experience Replay mechanism to improve performance, make full use of the goal-driven function of the nodes, and avoid misleading the agent through complex processes and blind spots in reward function design. The agent adjusts the number of model layers according to the number of short-term goals, further improving the efficiency and adaptability of the algorithm. Experimental results show that, compared with the hierarchical DQN method, GDH-DQN significantly improves the navigation success rate and is better suited to unknown scenarios such as Mars exploration.
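The Hindsight Experience Replay idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Transition` fields, the `sparse_reward` function (0 on reaching the goal, -1 otherwise), and the choice of the "final" relabeling strategy are all assumptions made for illustration, since the abstract does not specify them. The sketch shows how a failed episode under a sparse reward can be relabeled so that the state actually reached is treated as the goal, which produces at least one rewarded transition to learn from.

```python
from dataclasses import dataclass
from typing import List, Tuple

State = Tuple[int, int]  # hypothetical 2-D grid position of the robot


@dataclass(frozen=True)
class Transition:
    state: State
    action: int
    next_state: State
    goal: State      # goal the agent was pursuing when it acted
    reward: float
    done: bool


def sparse_reward(next_state: State, goal: State) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if next_state == goal else -1.0


def her_relabel(episode: List[Transition]) -> List[Transition]:
    """Relabel an episode with the 'final' strategy: pretend the last
    state actually reached was the goal, and recompute rewards."""
    if not episode:
        return []
    achieved = episode[-1].next_state
    relabeled = []
    for t in episode:
        reached = t.next_state == achieved
        relabeled.append(
            Transition(
                state=t.state,
                action=t.action,
                next_state=t.next_state,
                goal=achieved,
                reward=sparse_reward(t.next_state, achieved),
                done=reached,
            )
        )
    return relabeled


def make_episode(path: List[State], goal: State) -> List[Transition]:
    """Build transitions along a path, using a dummy action id of 0."""
    return [
        Transition(s, 0, s_next, goal, sparse_reward(s_next, goal), s_next == goal)
        for s, s_next in zip(path, path[1:])
    ]


# A failed episode: the robot never reaches the short-term goal (5, 5).
failed = make_episode([(0, 0), (1, 0), (2, 0)], goal=(5, 5))
replay_buffer = failed + her_relabel(failed)  # store original and relabeled data
```

In a two-layer setup such as the one described, the same relabeling could in principle be applied at each layer, with the lower layer's goals being the short-term position nodes and the upper layer's goal being the long-term target; how GDH-DQN does this in detail is not stated in the abstract.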