3D Point Cloud Object Detection Algorithm Based on Temporal Information Fusion and Uncertainty Estimation
Published:2023-06-08
Issue:12
Volume:15
Page:2986
ISSN:2072-4292
Container-title:Remote Sensing
language:en
Short-container-title:Remote Sensing
Author:
Xie Guangda (1), Li Yang (2), Wang Yanping (2), Li Ziyi (1), Qu Hongquan (2)
Affiliation:
1. College of Electrical and Control Engineering, North China University of Technology, Beijing 100144, China
2. College of Information, North China University of Technology, Beijing 100144, China
Abstract
In autonomous driving, LiDAR (light detection and ranging) data are acquired as a continuous sequence of frames. Most existing 3D object detection algorithms process each frame independently to propose object bounding boxes, and thus ignore temporal sequence information. This information is often useful for detecting objects whose shape is incompletely captured because of long distance or occlusion. To address this problem, we propose a 3D point cloud object detection algorithm that fuses temporal sequence information using the Ada-GRU (adaptive gated recurrent unit). In this method, a backbone network extracts the features of each LiDAR point cloud frame, and these features are fed to the Ada-GRU together with the hidden features of the previous frames. Compared with the traditional GRU, the Ada-GRU introduces an adaptive activation function so that the gating mechanism can be adjusted adaptively during training. The Ada-GRU outputs temporally fused features, which are used to predict the 3D objects in the current frame, and passes the hidden features of the current frame on to the next frame. In addition, label uncertainty for distant and occluded objects degrades model training. To address this, we model the 3D bounding box coordinates with a Gaussian probability distribution and design a corresponding bounding box loss function, which enables the model to learn and estimate the localization uncertainty of the box coordinates. Bounding boxes with large localization uncertainty are then removed in the post-processing stage to reduce the false positive rate. Experiments show that the proposed methods improve object detection accuracy without significantly increasing the complexity of the algorithm.
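The two mechanisms described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the exact formulas. The sketch assumes that the adaptive activation is a trainable slope multiplying each gate's pre-activation (in the spirit of the adaptive activation functions of Jagtap et al.), and that the box loss is a per-coordinate Gaussian negative log-likelihood with a predicted log-variance. All names (`ada_gru_step`, `gaussian_box_loss`, `filter_uncertain`, the parameter keys, and `max_sigma`) are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def ada_gru_step(x, h_prev, p):
    """One recurrent step of a GRU with adaptive activation slopes.

    ASSUMPTION: p holds hypothetical parameters. Wz, Uz, Wr, Ur, Wh, Uh are
    weight matrices; a_z, a_r, a_h are trainable slopes. With all slopes
    equal to 1.0 this reduces to a standard (bias-free) GRU step.
    """
    def pre(W, U, h):
        return [a + b for a, b in zip(matvec(W, x), matvec(U, h))]

    z = [sigmoid(p["a_z"] * s) for s in pre(p["Wz"], p["Uz"], h_prev)]  # update gate
    r = [sigmoid(p["a_r"] * s) for s in pre(p["Wr"], p["Ur"], h_prev)]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(p["a_h"] * s) for s in pre(p["Wh"], p["Uh"], gated)]
    # One common convention: z weights the candidate state, (1 - z) the old state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]

def gaussian_box_loss(mu, log_var, target):
    """Mean Gaussian negative log-likelihood over box coordinates.

    ASSUMPTION: the network predicts a mean and a log-variance for each
    coordinate; the constant 0.5 * log(2 * pi) term is dropped.
    """
    terms = [0.5 * math.exp(-lv) * (t - m) ** 2 + 0.5 * lv
             for m, lv, t in zip(mu, log_var, target)]
    return sum(terms) / len(terms)

def filter_uncertain(boxes, max_sigma):
    """Keep boxes whose mean predicted standard deviation is <= max_sigma.

    boxes is a list of (mu, log_var) pairs; max_sigma is a hypothetical
    post-processing threshold.
    """
    kept = []
    for mu, log_var in boxes:
        mean_sigma = sum(math.exp(0.5 * lv) for lv in log_var) / len(log_var)
        if mean_sigma <= max_sigma:
            kept.append((mu, log_var))
    return kept
```

Under this loss, predicting a larger variance for a coordinate with a large error lowers the squared-error term at the cost of the `0.5 * log_var` penalty. This gives the model a way to express uncertainty for ambiguous labels, and the predicted variance can then serve as a filtering score in post-processing.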
Funder
National Natural Science Foundation of China; Beijing Municipal Commission of Education; North China University of Technology
Subject
General Earth and Planetary Sciences
Cited by
1 article.
1. Research on Point Cloud Upsampling Technologies;Journal of Image and Signal Processing;2024