DFA-SAT: Dynamic Feature Abstraction with Self-Attention-Based 3D Object Detection for Autonomous Driving

Published: 2023-09-13 Issue: 18 Volume: 15 Page: 13667
ISSN: 2071-1050
Container-title: Sustainability
Language: en
Short-container-title: Sustainability

Author:

Mushtaq Husnain¹, Deng Xiaoheng¹, Ali Mubashir², Hayat Babur³, Raza Sherazi Hafiz Husnain⁴

Affiliation:

1. School of Computer Science and Engineering, Central South University, Changsha 410083, China

2. School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK

3. Department of Computer Science, University of Chenab, Gujrat 50700, Pakistan

4. School of Computing and Engineering, University of West London, London W5 5RF, UK

Abstract

Autonomous vehicles (AVs) play a crucial role in enhancing urban mobility within the context of a smarter and more connected urban environment. Three-dimensional object detection in AVs is an essential task for comprehending the driving environment to contribute to their safe use in urban environments. Existing 3D LiDAR object detection systems lose many critical point features during the down-sampling process and neglect the crucial interactions between local features, providing insufficient semantic information and leading to subpar detection performance. We propose a dynamic feature abstraction with self-attention (DFA-SAT), which utilizes self-attention to learn semantic features with contextual information by incorporating neighboring data and focusing on vital geometric details. DFA-SAT comprises four modules: object-based down-sampling (OBDS), semantic and contextual feature extraction (SCFE), multi-level feature re-weighting (MLFR), and local and global features aggregation (LGFA). The OBDS module preserves the maximum number of semantic foreground points along with their spatial information. SCFE learns rich semantic and contextual information with respect to spatial dependencies, refining the point features. MLFR decodes all the point features using a channel-wise multi-layered transformer approach. LGFA combines local features with decoding weights for global features using matrix product keys and query embeddings to learn spatial information across each channel. Extensive experiments using the KITTI dataset demonstrate significant improvements over the mainstream methods SECOND and PointPillars, improving the mean average precision (AP) by 6.86% and 6.43%, respectively, on the KITTI test dataset. DFA-SAT yields better and more stable performance for medium and long distances with a limited impact on real-time performance and model parameters, ensuring a transformative shift akin to when automobiles replaced conventional transportation in cities.
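
As a rough illustration of the self-attention idea the abstract refers to (a matrix product of query and key embeddings that re-weights point features), the following is a minimal sketch of scaled dot-product self-attention over a handful of point-feature vectors. It is not the DFA-SAT implementation: the function names `softmax` and `self_attention`, the identity query/key/value projections, and the toy `points` data are all assumptions made for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity projections (assumption).

    features: list of N point-feature vectors, each of length d.
    Returns (context, weights): N context vectors and the N x N attention weights.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    # Query-key matrix product: similarity of every point to every other point.
    scores = [
        [sum(q * k for q, k in zip(qi, kj)) / scale for kj in features]
        for qi in features
    ]
    # Normalise each row so the weights for one query point sum to 1.
    weights = [softmax(row) for row in scores]
    # Each output vector is a weighted average of all point features.
    context = [
        [sum(w * v[c] for w, v in zip(row, features)) for c in range(d)]
        for row in weights
    ]
    return context, weights


# Hypothetical toy input: three points with 2-dimensional features.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
context, weights = self_attention(points)
```

In this toy input the first two points have similar features, so the first point attends more strongly to the second point than to the third. A real detector would apply learned projections to obtain queries, keys, and values rather than reusing the raw features.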

Publisher

MDPI AG

Subject

Management, Monitoring, Policy and Law; Renewable Energy, Sustainability and the Environment; Geography, Planning and Development; Building and Construction

Link

https://www.mdpi.com/2071-1050/15/18/13667/pdf

References (68 articles)

1. Mitieka, D., Luke, R., Twinomurinzi, H., and Mageto, J. (2023). Smart Mobility in Urban Areas: A Bibliometric Review and Research Agenda. Sustainability, 15.

2. Shi, H., Hou, D., and Li, X. (2023). Center-Aware 3D Object Detection with Attention Mechanism Based on Roadside LiDAR. Sustainability, 15.

3. Lee, H.K. (2022). The Relationship between Innovative Technology and Driver’s Resistance and Acceptance Intention for Sustainable Use of Automobile Self-Driving System. Sustainability, 14.

4. Zhang, D., Li, Y., Li, Y., and Shen, Z. (2022). Service Failure Risk Assessment and Service Improvement of Self-Service Electric Vehicle. Sustainability, 14.

5. Xia, T., Lin, X., Sun, Y., and Liu, T. (2023). An Empirical Study of the Factors Influencing Users’ Intention to Use Automotive AR-HUD. Sustainability, 15.