CRTransSar: A Visual Transformer Based on Contextual Joint Representation Learning for SAR Ship Detection

Published: 2022-03-19 Issue: 6 Volume: 14 Page: 1488
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing

Author:

Xia Runfan, Chen Jie, Huang Zhixiang, Wan Huiyao, Wu Bocai, Sun Long, Yao Baidong, Xiang Haibing, Xing Mengdao

Abstract

Synthetic-aperture radar (SAR) image target detection is widely used in military, civilian, and other fields. However, existing detection methods have low accuracy because SAR image targets exhibit strong scattering, unclear edge contours, multiple scales, strong sparseness, and background interference. In response, this paper combines the global contextual perception of transformers with the local feature representation of convolutional neural networks (CNNs) and proposes a visual transformer framework for SAR target detection based on contextual joint-representation learning, referred to as CRTransSar. First, the paper adopts the Swin Transformer as the basic architecture. Next, it incorporates the CNN’s local information capture and designs a backbone, called CRbackbone, based on contextual joint-representation learning, which extracts richer contextual feature information while strengthening SAR target feature attributes. Furthermore, a new cross-resolution attention-enhancement neck, called CAENeck, is designed to improve the representation of multiscale SAR targets. On the SSDD dataset, our method attains a mAP of 97.0%, reaching state-of-the-art levels. In addition, based on the HISEA-1 commercial SAR satellite, which has been launched into orbit and in whose development our research group participated, we released a larger-scale SAR multiclass target detection dataset, called SMCDD, which verifies the effectiveness of our method.
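
The abstract reports detection quality as mAP. As a rough illustration of what that metric measures, the sketch below computes average precision (AP) for a single class on a single image; mAP is the mean of such AP values over classes. The abstract does not state the evaluation protocol, so the IoU threshold of 0.5, the all-point interpolation, and the function names `iou` and `average_precision` are assumptions made for illustration, not the authors' evaluation code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class. The 0.5 threshold is an assumed default.

    detections: list of (score, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits = []  # 1 = true positive, 0 = false positive, in descending score order
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Find the ground-truth box that overlaps this detection the most.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # A ground-truth box may be claimed by only one detection.
        if best_iou >= iou_threshold and not matched[best_idx]:
            matched[best_idx] = True
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))

    # Make precision non-increasing in recall (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

For example, calling `average_precision([(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60))], [(0, 0, 10, 10), (20, 20, 30, 30)])` scores one correct detection, one false alarm, and one missed target. A full benchmark evaluation would pool detections over all images before ranking them.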

Funder

National Natural Science Foundation of China

Natural Science Foundation of Anhui Province

China Postdoctoral Science Foundation

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Link

https://www.mdpi.com/2072-4292/14/6/1488/pdf
