Abstract
In this paper, we present a two-stage solution to 3D vehicle detection and segmentation. The first stage combines the EfficientNet-B3 architecture with multiple parallel residual blocks (inspired by the CenterNet architecture) for 3D localization and pose estimation of vehicles in the scene. The second stage takes the output of the first stage (cropped car images) as input to train EfficientNet-B3 for the image recognition task. Using predefined 3D models, we substitute each vehicle in the scene with its matching model, applying the rotation matrix and translation vector from the first stage, to obtain 3D detection bounding boxes and segmentation masks. We trained our models on an open-source dataset (ApolloCar3D). Our method outperforms all published solutions in terms of six-degrees-of-freedom error (6 DoF err).
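The model-substitution step described above can be illustrated with a minimal sketch: a rigid 6 DoF pose (rotation matrix R, translation vector t) maps a canonical 3D model's vertices into the scene frame, from which a 3D bounding box follows. Function names and shapes here are illustrative, not the authors' code:

```python
import numpy as np

def place_model(vertices, R, t):
    """Map canonical model vertices (N, 3) into the scene frame
    using a 3x3 rotation matrix R and a translation vector t (3,)."""
    return vertices @ R.T + t

def bbox_3d(vertices):
    """Axis-aligned 3D bounding box as (min corner, max corner)."""
    return vertices.min(axis=0), vertices.max(axis=0)

# Toy example: a unit-cube "model" rotated by identity and shifted.
cube = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                 [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]], dtype=float)
placed = place_model(cube, np.eye(3), np.array([2.0, 0.0, 5.0]))
lo, hi = bbox_3d(placed)
```

In the paper's pipeline, R and t come from the first-stage network and the vertices from the matched predefined 3D model; segmentation masks would then be obtained by projecting the placed mesh back into the image.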
Funder
Russian Science Foundation
Russian State Research
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by 8 articles.