Building Extraction from Very High Resolution Aerial Imagery Using Joint Attention Deep Neural Network

Published: 2019-12-11
Volume: 11
Issue: 24
Page: 2970
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing

Author:

Ye Ziran, Fu Yongyong, Gan Muye, Deng Jinsong, Comber Alexis, Wang Ke

Abstract

Automated methods for extracting buildings from very high resolution (VHR) remote sensing data have applications in a wide range of fields. Many convolutional neural network (CNN) based methods have been proposed and have achieved significant advances in the building extraction task. To refine predictions, many recent approaches fuse features from earlier layers of CNNs to introduce abundant spatial information, a strategy known as skip connection. However, reusing earlier features directly, without further processing, could reduce the performance of the network. To address this problem, we propose a novel fully convolutional network (FCN) that adopts attention-based re-weighting to extract buildings from aerial imagery. Specifically, we consider the semantic gap between features from different stages and leverage the attention mechanism to bridge the gap prior to the fusion of features. The inferred attention weights along the spatial and channel-wise dimensions make the low-level feature maps adaptive to the high-level feature maps in a target-oriented manner. Experimental results on three publicly available aerial imagery datasets show that the proposed model (RFA-UNet) achieves comparable or improved performance relative to other state-of-the-art models for building extraction.
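
The re-weighting idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the RFA-UNet implementation: the function name `reweight_skip`, the pooling choices, and the weight shapes `w_channel` and `w_spatial` are all assumptions made for illustration, and the high-level features are assumed to be already upsampled to the spatial size of the low-level features. It shows only the general pattern of inferring channel-wise and spatial attention weights from both feature maps and applying them to the low-level features before fusion.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used to squash attention scores into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def reweight_skip(low, high, w_channel, w_spatial):
    """Re-weight low-level features before fusing them with high-level ones.

    Hypothetical sketch, not the published architecture.
    low, high : arrays of shape (C, H, W); `high` is assumed to be
                upsampled to the same spatial size as `low`.
    w_channel : array of shape (C, 2C), an assumed linear projection.
    w_spatial : array of shape (2,), an assumed 1x1 mixing weight.
    Returns the fused array of shape (2C, H, W).
    """
    # Channel attention: global-average-pool both maps, project to C scores.
    pooled = np.concatenate([low.mean(axis=(1, 2)), high.mean(axis=(1, 2))])
    channel_att = sigmoid(w_channel @ pooled)            # shape (C,)

    # Spatial attention: channel-mean of each map, mixed per pixel.
    stacked = np.stack([low.mean(axis=0), high.mean(axis=0)])  # (2, H, W)
    spatial_att = sigmoid(np.tensordot(w_spatial, stacked, axes=1))  # (H, W)

    # Apply both attention maps to the low-level features only.
    reweighted = low * channel_att[:, None, None] * spatial_att[None, :, :]

    # Fuse by channel concatenation, as a plain skip connection would.
    return np.concatenate([reweighted, high], axis=0)
```

In a trained network the two weight arrays would be learned; here they are plain inputs so that the data flow is easy to follow. Because both attention maps pass through a sigmoid, each low-level activation is scaled by a factor between 0 and 1 rather than being passed through unchanged.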

Funder

National Natural Science Foundation of China

Natural Science Foundation of Zhejiang Province

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Link

https://www.mdpi.com/2072-4292/11/24/2970/pdf

Reference: 68 articles.

1. Use of shadows for detection of earthquake-induced collapsed buildings in high-resolution satellite imagery

2. Decision Fusion With Multiple Spatial Supports by Conditional Random Fields