Instance Segmentation Frustum–PointPillars: A Lightweight Fusion Algorithm for Camera–LiDAR Perception in Autonomous Driving

Published: 2024-01-03 Issue: 1 Volume: 12 Page: 153
ISSN: 2227-7390
Journal: Mathematics
Language: en

Author:

Wang Yongsheng¹, Han Xiaobo², Wei Xiaoxu³, Luo Jie²

Affiliation:

1. School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China

2. School of Automation, Wuhan University of Technology, Wuhan 430070, China

3. School of Automotive Engineering, Wuhan University of Technology, Wuhan 430070, China

Abstract

The fusion of camera and LiDAR perception has become a research focal point in the autonomous driving field. Existing image–point cloud fusion algorithms are overly complex, and processing large amounts of 3D LiDAR point cloud data requires high computational power, which poses challenges for practical applications. To address these problems, we propose an Instance Segmentation Frustum (ISF)–PointPillars method. The method takes input from both a camera and a LiDAR. RGB images are processed by an enhanced 2D object detection network based on YOLOv8, which yields rectangular bounding boxes and edge contours for the objects in the scene. The rectangular boxes are then extended into 3D space as frustums, and the 3D points located outside them are removed. Next, the 2D edge contours are likewise extended into frustums to filter the points that remain from the preceding stage. Finally, the retained points are sent to our improved 3D object detection network based on PointPillars, which infers key information such as object category, scale, and spatial position. In pursuit of a lightweight model, we incorporate attention modules into the 2D detector, refining the focus on essential features, minimizing redundant computations, and enhancing model accuracy and efficiency. Moreover, the point filtering algorithm substantially reduces both the volume and the dimensionality of the point cloud data, yielding lightweight 3D data. In comparative experiments on the KITTI dataset, our method outperforms traditional approaches, achieving an average precision (AP) of 88.94% and a bird’s-eye view (BEV) accuracy of 90.89% in car detection.
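
The two-stage point filtering described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the LiDAR points have already been transformed into the camera frame, that a 3 x 4 projection matrix is available, and that the 2D detector output has been converted into a pixel box and a boolean instance mask. The names `project_to_image`, `frustum_filter`, `P`, `box`, and `mask` are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_to_image(points_cam, P):
    """Project N x 3 camera-frame points to pixels with a 3 x 4 matrix P."""
    homo = np.hstack([points_cam, np.ones((len(points_cam), 1))])
    uvw = homo @ P.T
    depth = uvw[:, 2]
    # Points at or behind the image plane get a dummy divisor; they are
    # rejected later through the depth > 0 test.
    safe_depth = np.where(depth > 0, depth, 1.0)
    return uvw[:, :2] / safe_depth[:, None], depth


def frustum_filter(points_cam, P, box, mask):
    """Return two boolean arrays: points in the box frustum, then in the mask frustum."""
    uv, depth = project_to_image(points_cam, P)
    u, v = uv[:, 0], uv[:, 1]
    height, width = mask.shape
    x1, y1, x2, y2 = box

    # Stage 1: keep points in front of the camera whose projection falls
    # inside the rectangular box (equivalent to the box frustum in 3D).
    visible = (depth > 0) & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    in_box = visible & (u >= x1) & (u < x2) & (v >= y1) & (v < y2)

    # Stage 2: of those, keep points whose pixel lies on the instance mask
    # (equivalent to the frustum swept out by the object contour).
    cols = np.clip(np.floor(u), 0, width - 1).astype(int)
    rows = np.clip(np.floor(v), 0, height - 1).astype(int)
    in_mask = in_box & mask[rows, cols]
    return in_box, in_mask


# Toy setup (assumed values): focal length 100 px, principal point (50, 50),
# a 100 x 100 image, and a square instance mask inside the detection box.
P = np.array([[100.0, 0.0, 50.0, 0.0],
              [0.0, 100.0, 50.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
points = np.array([[0.0, 0.0, 10.0],   # projects to (50, 50): on the mask
                   [1.0, 0.0, 10.0],   # projects to (60, 50): in box, off mask
                   [4.0, 0.0, 10.0],   # projects to (90, 50): outside the box
                   [0.0, 0.0, -5.0]])  # behind the camera
box = (40, 40, 70, 70)
mask = np.zeros((100, 100), dtype=bool)
mask[45:56, 45:56] = True

in_box, in_mask = frustum_filter(points, P, box, mask)
print(in_box.tolist())   # [True, True, False, False]
print(in_mask.tolist())  # [True, False, False, False]
```

In this toy case the box frustum keeps two of the four points and the mask frustum keeps one, so only `points[in_mask]` would be handed to the 3D detector. On KITTI data, the LiDAR points would first need to be moved into the rectified camera frame with the dataset's calibration matrices before this projection step applies.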

Funder

Key R&D Program Project in Hubei Province, China: Research on Key Technologies of Robot Collaboration

Publisher

MDPI AG

Link

https://www.mdpi.com/2227-7390/12/1/153/pdf

References: 47 articles.

1. Fayyad, J., Jaradat, M.A., Gruyer, D., and Najjaran, H. (2020). Deep learning sensor fusion for autonomous vehicle perception and localization: A review. Sensors, 20.

2. Yeong, D.J., Velasco-Hernandez, G., Barry, J., and Walsh, J. (2021). Sensor and sensor fusion technology in autonomous vehicles: A review. Sensors, 21.

3. Hafiz (2020). A survey on instance segmentation: State of the art. Int. J. Multimed. Inf. Retr.

4. Gu (2022). A review on 2D instance segmentation based on deep neural networks. Image Vis. Comput.

5. He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017, January 22–29). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.

Cited by 2 articles.

1. FPGA Implementation of Pillar-Based Object Classification for Autonomous Mobile Robot. Electronics, 2024-08-01.

2. Implementation of MIMO Radar-Based Point Cloud Images for Environmental Recognition of Unmanned Vehicles and Its Application. Remote Sensing, 2024-05-14.