NGLSFusion: Non-Use GPU Lightweight Indoor Semantic SLAM

Published:2023-04-23 Issue:9 Volume:13 Page:5285
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Wan Le¹², Jiang Lin¹³, Tang Bo²³, Li Yunfei¹², Lei Bin¹³, Liu Honghai⁴

Affiliation:

1. Key Education Laboratory of Ministry of Metallurgical Equipment and Control, Wuhan University of Science and Technology, Wuhan 430081, China

2. Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan 430081, China

3. Institute of Robotics and Intelligent Systems, Wuhan University of Science and Technology, Wuhan 430081, China

4. School of Mechanical and Electrical Engineering and Automation, Harbin Institute of Technology, Shenzhen 518055, China

Abstract

Perception of the indoor environment is the basis of mobile robot localization, navigation, and path planning, and it is particularly important to construct semantic maps in real time using minimal resources. Existing methods depend heavily on the graphics processing unit (GPU) to acquire semantic information about the indoor environment and cannot build the semantic map in real time on the central processing unit (CPU). To address these problems, this paper proposes a lightweight indoor semantic map construction algorithm that does not use a GPU, named NGLSFusion. In the visual odometry (VO) method, ORB features are used to initialize the first frame, new keyframes are created by the optical flow method, and feature points are extracted by the direct method, which speeds up tracking. In the semantic map construction method, a pretrained model of the lightweight network LinkNet is optimized to provide semantic information in real time on devices with limited computing power, and a semantic point cloud is fused using OctoMap and Voxblox. Experimental results show that the proposed algorithm maintains camera pose accuracy while speeding up tracking, and obtains a structurally complete reconstructed semantic map without using a GPU.
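
The fusion step mentioned in the abstract, where labelled points are merged into a volumetric semantic map, can be illustrated with a minimal sketch. This is not the paper's implementation: OctoMap and Voxblox use octree and TSDF representations with their own update rules, whereas the sketch below assumes a plain dictionary of voxels and simple majority voting over labels. The function names `fuse_semantic_points` and `voxel_labels`, the 5 cm voxel size, and the sample points and labels are all hypothetical.

```python
from collections import Counter, defaultdict
from math import floor


def fuse_semantic_points(points, labels, voxel_size=0.05, grid=None):
    """Add one labelled point cloud to a voxel grid of label votes.

    points: iterable of (x, y, z) coordinates in metres (world frame assumed).
    labels: iterable of class labels, one per point.
    grid:   existing vote grid to update, or None to start a new one.
    """
    if grid is None:
        grid = defaultdict(Counter)
    for (x, y, z), label in zip(points, labels):
        # Quantize the point to the integer index of the voxel containing it.
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        grid[key][label] += 1
    return grid


def voxel_labels(grid):
    """Return the most frequently observed label for each occupied voxel."""
    return {key: votes.most_common(1)[0][0] for key, votes in grid.items()}


# Two hypothetical frames: the second disagrees with the first in one voxel.
grid = fuse_semantic_points(
    [(0.01, 0.02, 0.03), (0.02, 0.01, 0.04), (0.32, 0.0, 0.0)],
    ["chair", "chair", "floor"],
)
grid = fuse_semantic_points([(0.03, 0.03, 0.01)], ["table"], grid=grid)
semantic_map = voxel_labels(grid)
```

Majority voting lets repeated observations outweigh an occasional mislabelled pixel, which is the general idea behind fusing per-frame segmentation into a map. A real system would also weight votes by confidence and track occupancy.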

Funder

Lin Jiang

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/9/5285/pdf

Reference: 64 articles.

1. Schops, T., Schonberger, J.L., Galliani, S., Sattler, T., Schindler, K., Pollefeys, M., and Geiger, A. (2017, January 21–26). A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.

2. Enqvist, O., Kahl, F., and Olsson, C. (2011, January 6–13). Non-Sequential Structure from Motion. Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain.

3. Cadena, C., et al. (2016). Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age. IEEE Trans. Robot.

4. Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., and Liang, J. (2018). Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer.

5. Nirkin, Y., Wolf, L., and Hassner, T. (2021, January 20–25). HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.