Deep Residual Autoencoder with Multiscaling for Semantic Segmentation of Land-Use Images-Reference-Cited by-同舟云学术

Deep Residual Autoencoder with Multiscaling for Semantic Segmentation of Land-Use Images

Published:2019-09-14 Issue:18 Volume:11 Page:2142
ISSN:2072-4292
Container-title:Remote Sensing
language:en
Short-container-title:Remote Sensing

Author:

Li Lianfa^ORCID

Abstract

Semantic segmentation is a fundamental means of extracting information from remotely sensed images at the pixel level. Deep learning has enabled considerable improvements in efficiency and accuracy of semantic segmentation of general images. Typical models range from benchmarks such as fully convolutional networks, U-Net, Micro-Net, and dilated residual networks to the more recently developed DeepLab 3+. However, many of these models were originally developed for segmentation of general or medical images and videos, and are not directly relevant to remotely sensed images. The studies of deep learning for semantic segmentation of remotely sensed images are limited. This paper presents a novel flexible autoencoder-based architecture of deep learning that makes extensive use of residual learning and multiscaling for robust semantic segmentation of remotely sensed land-use images. In this architecture, a deep residual autoencoder is generalized to a fully convolutional network in which residual connections are implemented within and between all encoding and decoding layers. Compared with the concatenated shortcuts in U-Net, these residual connections reduce the number of trainable parameters and improve the learning efficiency by enabling extensive backpropagation of errors. In addition, resizing or atrous spatial pyramid pooling (ASPP) can be leveraged to capture multiscale information from the input images to enhance the robustness to scale variations. The residual learning and multiscaling strategies improve the trained model’s generalizability, as demonstrated in the semantic segmentation of land-use types in two real-world datasets of remotely sensed images. Compared with U-Net, the proposed method improves the Jaccard index (JI) or the mean intersection over union (MIoU) by 4-11% in the training phase and by 3-9% in the validation and testing phases. With its flexible deep learning architecture, the proposed approach can be easily applied for and transferred to semantic segmentation of land-use variables and other surface variables of remotely sensed images.

Funder

the Chinese Academy of Sciences

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Link

https://www.mdpi.com/2072-4292/11/18/2142/pdf

Reference89 articles.

1. Integrating visual cues for object segmentation and recognition

2. Deep learning in remote sensing applications: A meta-analysis and review