Author:
Chen Yunfan, Li Yuting, Zheng Wenqi, Wan Xiangkui
Abstract
The aftermath of a natural disaster often leaves victims trapped in rubble, where they are difficult for smart drones to detect because of low visibility in adverse disaster environments and the wide variation in victim sizes. To overcome these challenges, a transformer fusion-based scale-aware attention network (TFSANet) is proposed, which robustly integrates the latent interactions between RGB and thermal images to counter adverse environmental effects in disaster areas and addresses the detection of victims at various sizes. First, a transformer fusion model is developed that incorporates a two-stream backbone network to effectively fuse the complementary characteristics of RGB and thermal images. This addresses the problem that victims cannot be seen clearly under adverse disaster conditions such as smog and heavy rain. In addition, a scale-aware attention mechanism is embedded in the head network to adaptively adjust the size of the receptive fields, aiming to capture victims at different scales. Extensive experiments on two challenging datasets show that TFSANet achieves superior results. The proposed method achieves 86.56% average precision (AP) on the National Institute of Informatics—Chiba University (NII-CU) multispectral aerial person detection dataset, outperforming the state-of-the-art approach by 4.38%. On the drone-captured RGBT person detection (RGBTDronePerson) dataset, the proposed method improves the AP of the state-of-the-art approach by 4.33%.
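The abstract does not give the internals of the scale-aware attention mechanism; as a rough illustration only, the general idea of adaptively weighting receptive fields of different sizes can be sketched as follows. This is a minimal numpy sketch of a generic multi-scale attention scheme (in the spirit of selective-kernel attention), not the paper's actual module: the branch construction via average pooling, the kernel sizes, and the softmax-over-scales weighting are all assumptions for illustration.

```python
import numpy as np

def avg_pool_same(x, k):
    # Naive same-size average pooling (blur) with window k on an H x W x C map;
    # larger k crudely simulates a larger receptive field.
    H, W, C = x.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + k, j:j + k].mean(axis=(0, 1))
    return out

def scale_aware_attention(x, kernel_sizes=(1, 3, 5)):
    """Fuse multi-scale branches with softmax attention over the scale axis.

    Hypothetical sketch: each branch sees the feature map at a different
    effective receptive field, and per-channel softmax weights decide how
    much each scale contributes to the fused output.
    """
    branches = [avg_pool_same(x, k) for k in kernel_sizes]
    # Per-scale descriptor: global average over spatial dims -> shape (S, C).
    desc = np.stack([b.mean(axis=(0, 1)) for b in branches])
    # Softmax across the scale axis yields per-channel scale weights.
    e = np.exp(desc - desc.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)                      # (S, C)
    # Weighted sum of branches, broadcasting channel weights spatially.
    fused = sum(wi[None, None, :] * b for wi, b in zip(w, branches))
    return fused, w

feat = np.random.rand(8, 8, 4)        # toy feature map
fused, weights = scale_aware_attention(feat)
```

The fused map keeps the input shape, and the weights sum to one across scales for every channel, so small objects can lean on the small-kernel branch while large ones favor wider context.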
Funder
Wuhan Knowledge Innovation Project
Natural Science Foundation of Hubei Province
Publisher
Springer Science and Business Media LLC
References (38 articles)
1. Arnold RD, Yamaguchi H, Tanaka T (2018) Search and rescue with autonomous flying robots through behavior-based cooperative intelligence. J Int Hum Act 3(1):1–18
2. Hwang S, Park J, Kim N et al (2015) Multispectral pedestrian detection: benchmark dataset and baseline. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1037–1045
3. Wagner J, Fischer V, Herman M et al (2016) Multispectral pedestrian detection using deep fusion convolutional neural networks. ESANN 587:509–514
4. Liu J, Zhang S, Wang S et al (2016) Multispectral deep neural networks for pedestrian detection. arXiv preprint arXiv:1611.02644
5. Chen Y, Xie H, Shin H (2018) Multi-layer fusion techniques using a CNN for multispectral pedestrian detection. IET Comput Vision 12(8):1179–1187