Semantic Segmentation and Depth Estimation Based on Residual Attention Mechanism
Authors:
Ji Naihua¹, Dong Huiqian¹, Meng Fanyun¹, Pang Liping²
Affiliations:
1. School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266033, China
2. School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
Abstract
Semantic segmentation and depth estimation are crucial components of scene understanding in autonomous driving, and learning the two tasks jointly can improve that understanding. However, task-specific networks may not adequately extract global features from the task-shared network. To address this issue, we propose a multi-task residual attention network (MTRAN) that consists of a global shared network and two attention networks, one dedicated to semantic segmentation and one to depth estimation. A convolutional block attention module (CBAM) highlights the global feature map, and residual connections are added to prevent network degradation. To keep the task losses manageable and prevent any single task from dominating training, we introduce a random-weighted strategy into the impartial multi-task learning method. Experiments demonstrate the effectiveness of the proposed method.
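The abstract does not say how the random weights are drawn or how they enter the impartial multi-task learning update. As a rough illustration of the general idea only, the sketch below draws positive task weights that sum to 1 (a softmax over Gaussian samples, which is an assumption here and not necessarily the authors' scheme) and uses them to combine a segmentation loss and a depth loss. The function names and the loss values are hypothetical.

```python
import math
import random


def random_task_weights(num_tasks, rng):
    """Draw positive weights that sum to 1 via a softmax over Gaussian samples.

    Assumption: the sampling distribution is illustrative, not taken from the paper.
    """
    logits = [rng.gauss(0.0, 1.0) for _ in range(num_tasks)]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_losses(losses, weights):
    """Return the weighted sum of per-task scalar losses."""
    if len(losses) != len(weights):
        raise ValueError("need one weight per task loss")
    return sum(w * l for w, l in zip(weights, losses))


rng = random.Random(0)           # fixed seed so the sketch is repeatable
seg_loss, depth_loss = 0.9, 0.4  # hypothetical per-task loss values
weights = random_task_weights(2, rng)
total_loss = combine_losses([seg_loss, depth_loss], weights)
```

In a training loop the weights would typically be redrawn at every step, so that neither task receives a consistently larger share of the combined loss.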
Funder
Project of Huzhou Science and Technology High-level Talents Innovation Support Program of Dalian
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
2 articles.