PI-RCNN: An Efficient Multi-Sensor 3D Object Detector with Point-Based Attentive Cont-Conv Fusion Module

Published:2020-04-03 Issue:07 Volume:34 Page:12460-12467
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Xie Liang, Xiang Chao, Yu Zhengxu, Xu Guodong, Yang Zheng, Cai Deng, He Xiaofei

Abstract

LIDAR point clouds and RGB images are both essential for 3D object detection, so many state-of-the-art 3D detection algorithms are dedicated to fusing these two types of data effectively. However, their fusion methods, which are based on Bird's Eye View (BEV) or voxel formats, are not accurate. In this paper, we propose a novel fusion approach named the Point-based Attentive Cont-conv Fusion (PACF) module, which fuses multi-sensor features directly on 3D points. In addition to continuous convolution, we add Point-Pooling and Attentive Aggregation to make the fused features more expressive. Moreover, based on the PACF module, we propose a 3D multi-sensor multi-task network called Pointcloud-Image RCNN (PI-RCNN for short), which handles the image segmentation and 3D object detection tasks. PI-RCNN employs a segmentation sub-network to extract full-resolution semantic feature maps from images and then fuses the multi-sensor features via the PACF module. Benefiting from the effectiveness of the PACF module and the expressive semantic features from the segmentation module, PI-RCNN improves considerably in 3D object detection. We demonstrate the effectiveness of the PACF module and PI-RCNN on the KITTI 3D Detection benchmark, where our method achieves state-of-the-art results on the 3D AP metric.
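
The idea of fusing image features directly on 3D points can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a pinhole camera with points already in the camera frame, brute-force nearest neighbours, and a softmax over negative distances as a stand-in for the learned weights that continuous convolution and attentive aggregation would use. The function names (`project`, `sample`, `fuse_point`) and all parameters are hypothetical.

```python
import math

def project(point, focal=1.0, cx=0.0, cy=0.0):
    """Pinhole projection of a camera-frame point (x, y, z) to pixel (u, v)."""
    x, y, z = point
    return focal * x / z + cx, focal * y / z + cy

def sample(feature_map, u, v):
    """Nearest-pixel feature lookup, clamped to the map bounds."""
    h, w = len(feature_map), len(feature_map[0])
    row = min(max(int(round(v)), 0), h - 1)
    col = min(max(int(round(u)), 0), w - 1)
    return feature_map[row][col]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_point(index, points, feature_map, k=3, **cam):
    """Fuse image features onto one 3D point from its k nearest neighbours."""
    center = points[index]
    # Brute-force k-NN in 3D; the point itself is included at distance 0.
    nearest = sorted(points, key=lambda p: math.dist(center, p))[:k]
    # Look up the image feature under each neighbour's projection.
    feats = [sample(feature_map, *project(p, **cam)) for p in nearest]
    # Assumption: distance-based softmax replaces learned attention weights.
    weights = softmax([-math.dist(center, p) for p in nearest])
    dim = len(feats[0])
    attended = [sum(w * f[d] for w, f in zip(weights, feats)) for d in range(dim)]
    # Element-wise max over neighbours, a simple form of pooling.
    pooled = [max(f[d] for f in feats) for d in range(dim)]
    return attended + pooled  # concatenated fused feature
```

Calling `fuse_point(0, points, feature_map, k=3)` returns a vector twice as long as one image feature: the weighted sum followed by the pooled maximum. In a real network, both the weights and the feature map would come from trained sub-networks rather than fixed rules.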

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 110 articles.

1. VIDF-Net: A Voxel-Image Dynamic Fusion method for 3D object detection;Computer Vision and Image Understanding;2024-12

2. C2BG-Net: Cross-modality and cross-scale balance network with global semantics for multi-modal 3D object detection;Neural Networks;2024-11

3. DenseSphere: Multimodal 3D object detection under a sparse point cloud based on spherical coordinate;Expert Systems with Applications;2024-10

4. Three-Dimensional Object Detection Network Based on Multi-Layer and Multi-Modal Fusion;Electronics;2024-09-04

5. VoPiFNet: Voxel-Pixel Fusion Network for Multi-Class 3D Object Detection;IEEE Transactions on Intelligent Transportation Systems;2024-08