TransMF: Transformer-Based Multi-Scale Fusion Model for Crack Detection

Published: 2022-07-05  Issue: 13  Volume: 10  Page: 2354
ISSN: 2227-7390
Container-title: Mathematics
Language: en

Author:

Ju Xiaochen, Zhao Xinxin, Qian Shengsheng

Abstract

Cracks are widespread in infrastructure that is closely related to human activity. Using artificial intelligence to detect cracks automatically, a task known as crack detection, has become very popular. Background noise in crack images, the discontinuity of cracks and other problems make crack detection highly challenging. Although many approaches have been proposed, two challenges remain: (1) cracks are long and complex in shape, making it difficult to capture long-range continuity; (2) most images in crack datasets contain noise, and it is difficult to detect only the cracks while ignoring the noise. In this paper, we propose a novel method called Transformer-based Multi-scale Fusion Model (TransMF) for crack detection, comprising an Encoder Module (EM), a Decoder Module (DM) and a Fusion Module (FM). The Encoder Module uses a hybrid of convolution blocks and Swin Transformer blocks to model the long-range dependencies between different parts of a crack image from both local and global perspectives. The Decoder Module is designed with a structure symmetrical to that of the Encoder Module. In the Fusion Module, the outputs of each layer of the Encoder Module and Decoder Module, each at its own scale, are fused by convolution, which can reduce the effect of background noise and strengthen the correlations between relevant context to enhance crack detection. Finally, the outputs of all layers of the Fusion Module are concatenated to produce the crack detection result. Extensive experiments on three benchmark datasets (CrackLS315, CRKWH100 and DeepCrack) demonstrate that the proposed TransMF outperforms the best existing baselines.
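
The fusion idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`upsample_nearest`, `fuse_pair`, `fuse_scales`), the hand-set weights, the nearest-neighbour upsampling and the final sigmoid are all assumptions made for illustration, and feature maps are reduced to single-channel lists of rows. It only shows the general pattern of fusing an encoder map with a decoder map at each scale by a 1x1-convolution-style weighted sum, then bringing the per-scale results to a common resolution and combining them.

```python
import math

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D map (list of rows) by an integer factor."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]      # repeat each column
        out.extend([list(wide) for _ in range(factor)])     # repeat each row
    return out

def fuse_pair(enc, dec, w_enc, w_dec, bias=0.0):
    """Fuse same-sized encoder and decoder maps with a 1x1-convolution-style
    weighted sum over the two 'channels' (weights are hypothetical)."""
    return [[w_enc * e + w_dec * d + bias for e, d in zip(er, dr)]
            for er, dr in zip(enc, dec)]

def fuse_scales(side_outputs, weights, bias=0.0):
    """Upsample every per-scale map to the finest resolution, treat the stack
    as concatenated channels, and combine it with a 1x1 convolution + sigmoid.
    Assumes each map's size divides the finest size by the same integer
    factor in both dimensions."""
    target_h = max(len(m) for m in side_outputs)
    upsampled = [upsample_nearest(m, target_h // len(m)) for m in side_outputs]
    width = len(upsampled[0][0])
    return [[1.0 / (1.0 + math.exp(-(sum(wt * m[i][j]
                                         for wt, m in zip(weights, upsampled)) + bias)))
             for j in range(width)]
            for i in range(target_h)]

# Two scales: a 4x4 pair and a 2x2 pair of encoder/decoder maps (toy values).
fine = fuse_pair([[1.0] * 4 for _ in range(4)], [[-1.0] * 4 for _ in range(4)], 0.5, 0.5)
coarse = fuse_pair([[2.0, 0.0], [0.0, 2.0]], [[0.0, 0.0], [0.0, 0.0]], 0.5, 0.5)
prediction = fuse_scales([fine, coarse], weights=[1.0, 1.0])
```

In a trained network the weights would be learned and the maps would have many channels; the sketch fixes them by hand only to keep the data flow visible.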

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/10/13/2354/pdf

Cited by 14 articles.

1. FCT-Net: A dual-encoding-path network fusing atrous spatial pyramid pooling and transformer for pavement crack detection;Engineering Applications of Artificial Intelligence;2024-11

2. Deep learning for automated multiclass surface damage detection in bridge inspections;Automation in Construction;2024-10

3. DefNet: A multi-scale dual-encoding fusion network aggregating Transformer and CNN for crack segmentation;Construction and Building Materials;2024-10

4. Crack-SAM: Crack Segmentation Using a Foundation Model;2024-08-21

5. Bridge crack detection method based on multi-scale feature fusion;International Conference on Algorithms, Software Engineering, and Network Security;2024-04-26