Author:
Chen Yunfan, Li Yuting, Zheng Wenqi, Wan Xiangkui
Abstract
The aftermath of a natural disaster often leaves victims trapped in rubble, where they are difficult for smart drones to detect because of low visibility in adverse disaster environments and the wide variation in victim sizes. To overcome these challenges, a transformer fusion-based scale-aware attention network (TFSANet) is proposed, which robustly integrates the latent interactions between RGB and thermal images to counter adverse environmental effects in disaster areas and addresses the detection of victims at various sizes. First, a transformer fusion model is developed that incorporates a two-stream backbone network to effectively fuse the complementary characteristics of RGB and thermal images. This addresses the problem that victims cannot be seen clearly under adverse disaster conditions such as smog and heavy rain. In addition, a scale-aware attention mechanism is embedded in the head network to adaptively adjust the size of the receptive fields, aiming to capture victims at different scales. Extensive experiments on two challenging datasets show that TFSANet achieves superior results. The proposed method achieves 86.56% average precision (AP) on the National Institute of Informatics—Chiba University (NII-CU) multispectral aerial person detection dataset, outperforming the state-of-the-art approach by 4.38%. On the drone-captured RGBT person detection (RGBTDronePerson) dataset, the proposed method improves the AP of the state-of-the-art approach by 4.33%.
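The abstract does not give the internals of the scale-aware attention mechanism; as a rough illustration only, the general idea of adaptively weighting receptive fields of different sizes can be sketched as follows. This is a minimal numpy sketch of a generic multi-scale attention scheme (in the spirit of selective-kernel attention), not the paper's actual module: the branch construction via average pooling, the kernel sizes, and the softmax-over-scales weighting are all assumptions for illustration.

```python
import numpy as np

def avg_pool_same(x, k):
    # Naive same-size average pooling (blur) with window k on an H x W x C map;
    # larger k crudely simulates a larger receptive field.
    H, W, C = x.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + k, j:j + k].mean(axis=(0, 1))
    return out

def scale_aware_attention(x, kernel_sizes=(1, 3, 5)):
    """Fuse multi-scale branches with softmax attention over the scale axis.

    Hypothetical sketch: each branch sees the feature map at a different
    effective receptive field, and per-channel softmax weights decide how
    much each scale contributes to the fused output.
    """
    branches = [avg_pool_same(x, k) for k in kernel_sizes]
    # Per-scale descriptor: global average over spatial dims -> shape (S, C).
    desc = np.stack([b.mean(axis=(0, 1)) for b in branches])
    # Softmax across the scale axis yields per-channel scale weights.
    e = np.exp(desc - desc.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)                      # (S, C)
    # Weighted sum of branches, broadcasting channel weights spatially.
    fused = sum(wi[None, None, :] * b for wi, b in zip(w, branches))
    return fused, w

feat = np.random.rand(8, 8, 4)        # toy feature map
fused, weights = scale_aware_attention(feat)
```

The fused map keeps the input shape, and the weights sum to one across scales for every channel, so small objects can lean on the small-kernel branch while large ones favor wider context.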
Funder
Wuhan Knowledge Innovation Project
Natural Science Foundation of Hubei Province
Publisher
Springer Science and Business Media LLC
References (38 articles)
1. Arnold RD, Yamaguchi H, Tanaka T (2018) Search and rescue with autonomous flying robots through behavior-based cooperative intelligence. J Int Hum Act 3(1):1–18
2. Hwang S, Park J, Kim N et al (2015) Multispectral pedestrian detection: benchmark dataset and baseline. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1037–1045
3. Wagner J, Fischer V, Herman M et al (2016) Multispectral pedestrian detection using deep fusion convolutional neural networks. ESANN 587:509–514
4. Liu J, Zhang S, Wang S et al (2016) Multispectral deep neural networks for pedestrian detection. arXiv preprint arXiv:1611.02644
5. Chen Y, Xie H, Shin H (2018) Multi-layer fusion techniques using a CNN for multispectral pedestrian detection. IET Comput Vision 12(8):1179–1187