Multi-Modality Image Fusion and Object Detection Based on Semantic Information

Published: 2023-04-26 Issue: 5 Volume: 25 Page: 718
ISSN: 1099-4300
Container-title: Entropy
Language: en
Short-container-title: Entropy

Author:

Liu Yong¹, Zhou Xin², Zhong Wei²

Affiliation:

1. School of Software Technology, Dalian University of Technology, Dalian 116620, China

2. International School of Information Science & Engineering, Dalian University of Technology, Dalian 116620, China

Abstract

Infrared and visible image fusion (IVIF) aims to provide informative images by combining complementary information from different sensors. Existing IVIF methods based on deep learning focus on strengthening the network with increasing depth but often ignore the importance of transmission characteristics, resulting in the degradation of important information. In addition, while many methods use various loss functions or fusion rules to retain complementary features of both modes, the fusion results often retain redundant or even invalid information. In order to accurately extract the effective information from both infrared images and visible light images without omission or redundancy, and to better serve downstream tasks such as target detection with the fused image, we propose a multi-level structure search attention fusion network based on semantic information guidance, which realizes the fusion of infrared and visible images in an end-to-end way. Our network has two main contributions: the use of neural architecture search (NAS) and the newly designed multilevel adaptive attention module (MAAB). These methods enable our network to retain the typical characteristics of the two modes while removing useless information for the detection task in the fusion results. In addition, our loss function and joint training method can establish a reliable relationship between the fusion network and subsequent detection tasks. Extensive experiments on the new dataset (M3FD) show that our fusion method has achieved advanced performance in both subjective and objective evaluations, and the mAP in the object detection task is improved by 0.5% compared to the second-best method (FusionGAN).

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/25/5/718/pdf

References: 62 articles.

1. An introduction to multisensor data fusion;Hall;Proc. IEEE,1997

2. A bilevel integrated model with data-driven layer ensemble for multi-modality image fusion;Liu;IEEE Trans. Image Process.,2020

3. Attention-guided global-local adversarial learning for detail-preserving multi-exposure image fusion;Liu;IEEE Trans. Circuits Syst. Video Technol.,2022

4. Bilevel modeling investigated generative adversarial framework for image restoration;Jiang;Vis. Comput.,2022

5. Ma, L., Ma, T., Liu, R., Fan, X., and Luo, Z. (2022, January 19–20). Toward Fast, Flexible, and Robust Low-Light Image Enhancement. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.

Cited by 3 articles.

1. An effective reconstructed pyramid crosspoint fusion for multimodal infrared and visible images;Signal, Image and Video Processing;2024-06-21

2. 3D Object Detection under Urban Road Traffic Scenarios Based on Dual-Layer Voxel Features Fusion Augmentation;Sensors;2024-05-21

3. An Improved Hybrid Model for Target Detection;2023 International Conference on Emerging Techniques in Computational Intelligence (ICETCI);2023-09-21