Robust BEV 3D Object Detection for Vehicles with Tire Blow-Out
Authors:
Yang Dongsheng1, Fan Xiaojie1, Dong Wei1, Huang Chaosheng2, Li Jun2
Affiliations:
1. BYD Auto Industry Company Limited, Shenzhen 518000, China
2. School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China
Abstract
The bird’s-eye view (BEV) representation, a vision-centric perception paradigm, is essential and promising for future autonomous-vehicle perception: it is fusion-friendly, intuitive, supports end-to-end optimization, and is cheaper than LiDAR. The performance of existing BEV methods, however, deteriorates when a tire blows out, because they rely heavily on accurate camera calibration, which the noisy camera parameters produced during a blow-out can invalidate. It is therefore extremely unsafe to use existing BEV methods in a tire blow-out situation. In this paper, we propose a geometry-guided auto-resizable kernel transformer (GARKT), designed specifically for vehicles with a tire blow-out. Specifically, we establish a camera deviation model for vehicles with a blown-out tire, then use geometric priors to locate the prior positions in the perspective view with auto-resizable kernels. The resizable perception areas are encoded and flattened to generate the BEV representation. GARKT achieves a nuScenes detection score (NDS) of 0.439 on a newly created blow-out dataset based on nuScenes, and still attains an NDS of 0.431 when the tire is completely flat, far more robust than other transformer-based BEV methods. Moreover, GARKT runs at near real-time speed, about 20.5 fps on one GPU.
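To make the abstract's pipeline concrete, below is a minimal NumPy sketch of the idea as described: a camera deviation model mapping a tire blow-out to pitch/roll, geometry-guided projection of BEV grid cells into the deviated perspective view, and a sampling kernel that resizes with the deviation. Everything here is an illustrative assumption, not the paper's implementation: the drop-to-angle model (ride-height drop over wheelbase and track), the mean-pooling stand-in for the learned kernel, and all names and parameters are hypothetical.

```python
import numpy as np

# --- Assumed camera deviation model for a tire blow-out ---
# A blown tire drops that corner of the body by `drop` metres; for small
# angles this induces pitch ~ drop/wheelbase and roll ~ drop/track.
def blowout_deviation(drop, wheelbase=2.8, track=1.6):
    pitch = np.arctan2(drop, wheelbase)   # nose-down rotation (rad)
    roll = np.arctan2(drop, track)        # lean toward the blown wheel (rad)
    return pitch, roll

def rot_x(a):  # rotation about the camera x-axis (pitch in camera frame)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_z(a):  # rotation about the camera z-axis (roll in camera frame)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# World frame: x forward, y left, z up. Camera frame: x right, y down, z forward.
R_BASE = np.array([[0, -1, 0], [0, 0, -1], [1, 0, 0]], dtype=float)

def project(points_w, K, cam_height, pitch, roll):
    """Project Nx3 ground points into the (deviated) perspective view."""
    p = points_w - np.array([0.0, 0.0, cam_height])   # camera at the origin
    p_cam = (rot_z(roll) @ rot_x(pitch) @ R_BASE @ p.T).T
    z = p_cam[:, 2]
    uv = (K @ p_cam.T).T
    return uv[:, :2] / z[:, None], z

def garkt_like_bev(feat, K, pitch, roll, cam_height=1.5, bev_range=40, bev_res=80):
    """Sketch of geometry-guided sampling with an auto-resized kernel."""
    H, W, C = feat.shape
    # The kernel radius grows with the pixel shift the deviation can cause:
    # a pure rotation of `a` rad moves projections by roughly f * tan(a) px.
    shift = K[1, 1] * np.tan(abs(pitch)) + K[0, 0] * np.tan(abs(roll))
    r = int(np.clip(1 + shift, 1, 15))
    xs = np.linspace(1, bev_range, bev_res)
    ys = np.linspace(-bev_range / 2, bev_range / 2, bev_res)
    grid = np.stack(np.meshgrid(xs, ys, indexing="ij"), -1).reshape(-1, 2)
    pts = np.concatenate([grid, np.zeros((len(grid), 1))], 1)  # ground plane
    uv, z = project(pts, K, cam_height, pitch, roll)
    bev = np.zeros((bev_res, bev_res, C))
    for idx, ((u, v), depth) in enumerate(zip(uv, z)):
        if depth <= 0:
            continue
        u, v = int(round(u)), int(round(v))
        u0, u1 = max(u - r, 0), min(u + r + 1, W)
        v0, v1 = max(v - r, 0), min(v + r + 1, H)
        if u0 < u1 and v0 < v1:
            # Mean pooling stands in for the learned resizable kernel.
            bev[np.unravel_index(idx, (bev_res, bev_res))] = feat[v0:v1, u0:u1].mean((0, 1))
    return bev

K = np.array([[800.0, 0, 640], [0, 800.0, 360], [0, 0, 1]])
feat = np.random.rand(720, 1280, 8)              # stand-in image feature map
pitch, roll = blowout_deviation(drop=0.10)       # 10 cm ride-height drop
print(garkt_like_bev(feat, K, pitch, roll).shape)  # (80, 80, 8)
```

In a full model one would presumably replace the mean-pooled window with a learned transformer attention over the resized kernel region, and estimate the deviation parameters online rather than from a fixed ride-height drop.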
Funder
Research on the Mechanical Load Response Mechanism