Abstract
Convolutional neural networks (CNNs) increasingly focus on capturing contextual features (con. feat) to improve performance in semantic segmentation tasks. However, high-level con. feat are biased towards encoding features of large objects, disregard spatial details, and have a limited capacity to discriminate between easily confused classes (e.g., trees and grass). We therefore incorporate low-level features (low. feat) and class-specific discriminative features (dis. feat) to boost model performance further: low. feat help the model recover spatial information, and dis. feat reduce class confusion during segmentation. To this end, we propose MFNet, a novel deep multi-feature learning framework for the semantic segmentation of very high resolution (VHR) remote sensing images (RSIs). MFNet adopts a multi-feature learning mechanism to learn more complete features, including con. feat, low. feat, and dis. feat. Specifically, in addition to a widely used context aggregation module for capturing con. feat, we append two branches for learning low. feat and dis. feat. The first learns low. feat at a shallow layer of the backbone network through local contrast processing; the second groups con. feat and then optimizes each class individually to generate dis. feat with better inter-class discriminative capability. Extensive quantitative and qualitative evaluations demonstrate that MFNet outperforms most state-of-the-art models on the ISPRS Vaihingen and Potsdam datasets. In particular, thanks to multi-feature learning, our model achieves an overall accuracy of 91.91% on the Potsdam test set with VGG16 as the backbone, performing favorably against advanced models that use ResNet101.
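For readers unfamiliar with the reported metric, overall accuracy is the fraction of pixels whose predicted class matches the reference label. The following is a minimal sketch of that definition only. The function name `overall_accuracy` and the toy label maps are illustrative assumptions, not the authors' evaluation code, and the benchmark's exact evaluation protocol (for example, which pixels are counted) is not reproduced here.

```python
def overall_accuracy(pred, truth):
    """Return the fraction of pixels where the predicted class equals the reference class.

    `pred` and `truth` are 2-D label maps given as lists of rows of integer class ids.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    # Flatten both label maps into pixel sequences.
    pred_flat = [label for row in pred for label in row]
    truth_flat = [label for row in truth for label in row]

    if len(pred_flat) != len(truth_flat):
        raise ValueError("prediction and reference must contain the same number of pixels")
    if not pred_flat:
        raise ValueError("label maps must not be empty")

    # Count pixels whose predicted class agrees with the reference class.
    correct = sum(p == t for p, t in zip(pred_flat, truth_flat))
    return correct / len(pred_flat)


# Toy 2x2 example with made-up class ids: three of four pixels agree.
toy_pred = [[0, 1], [2, 2]]
toy_truth = [[0, 1], [2, 1]]
toy_score = overall_accuracy(toy_pred, toy_truth)
```

Because overall accuracy weights every pixel equally, it tends to be dominated by large classes, which is one reason per-class scores are usually reported alongside it.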
Subject
General Earth and Planetary Sciences
Reference
54 articles.
Cited by
22 articles.