DPACFuse: Dual-Branch Progressive Learning for Infrared and Visible Image Fusion with Complementary Self-Attention and Convolution
Author:
Zhu Huayi 1, Wu Heshan 1, Wang Xiaolong 1, He Dongmei 1, Liu Zhenbing 2, Pan Xipeng 1
Affiliation:
1. School of Computer and Information Security, Guilin University of Electronic Science and Technology, Guilin 541004, China
2. School of Artificial Intelligence, Guilin University of Electronic Science and Technology, Guilin 541004, China
Abstract
Infrared and visible image fusion aims to generate a single fused image that not only contains rich texture details and salient objects but also facilitates downstream tasks. However, existing works mainly focus on learning modality-specific or modality-shared features and ignore the importance of modeling cross-modality features. To address these challenges, we propose DPACFuse, a Dual-branch Progressive learning network for infrared and visible image fusion with complementary self-Attention and Convolution. On the one hand, we propose Cross-Modality Feature Extraction (CMEF) to enhance information interaction and the extraction of common features across modalities, and we introduce a high-frequency gradient convolution operation to extract fine-grained information and suppress high-frequency information loss. On the other hand, to alleviate the limited global-information extraction of CNNs and the heavy computational overhead of self-attention, we introduce ACmix, which fully extracts local and global information from the source images at a smaller computational overhead than pure convolution or pure self-attention. Extensive experiments demonstrate that the fused images generated by DPACFuse not only contain rich texture information but also effectively highlight salient objects. Our method achieves an approximately 3% improvement over state-of-the-art methods on the MI, Qabf, SF, and AG metrics. More importantly, our fused images improve object detection and semantic segmentation performance by approximately 10% compared with using infrared or visible images alone.
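The abstract does not specify how the high-frequency gradient convolution is realized. As a minimal illustrative sketch (not the authors' implementation), the PyTorch module below pairs a learnable 3x3 convolution with fixed depthwise Sobel filters so that image gradients, i.e. high-frequency edge and texture detail, are carried through explicitly; the name GradientConv and all design choices here are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradientConv(nn.Module):
    """Hypothetical high-frequency gradient convolution: a learnable 3x3
    conv branch plus fixed Sobel filters applied depthwise, so fine-grained
    edge/texture detail is preserved alongside the learned features."""

    def __init__(self, channels: int):
        super().__init__()
        self.channels = channels
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        # Fixed Sobel kernels for horizontal and vertical gradients.
        sobel_x = torch.tensor([[-1., 0., 1.],
                                [-2., 0., 2.],
                                [-1., 0., 1.]])
        sobel_y = sobel_x.t()
        kernel = torch.stack([sobel_x, sobel_y])                  # (2, 3, 3)
        kernel = kernel.unsqueeze(1).repeat(channels, 1, 1, 1)    # (2C, 1, 3, 3)
        self.register_buffer("sobel", kernel)                     # not trained
        self.fuse = nn.Conv2d(2 * channels, channels, 1)          # merge grad maps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        smooth = self.conv(x)
        # Depthwise gradient maps: each channel yields an x- and y-gradient.
        grads = F.conv2d(x, self.sobel, padding=1, groups=self.channels)
        return smooth + self.fuse(grads)


if __name__ == "__main__":
    feat = torch.randn(1, 16, 64, 64)
    print(GradientConv(16)(feat).shape)  # torch.Size([1, 16, 64, 64])
```

A block like this would sit inside each branch's feature extractor: the learnable path models appearance while the fixed Sobel path guards against the high-frequency information loss the abstract describes.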
Funder
National Natural Science Foundation of China; Guangxi Natural Science Foundation; University Student Innovation Training Program Project
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
2 articles.