Abstract
Semantic segmentation of remote sensing urban scene images has diverse practical applications, including land cover mapping, urban change detection, environmental protection, and economic evaluation. However, classical semantic segmentation networks face challenges such as inadequate utilization of multi-scale semantic information and imprecise segmentation of edge targets in high-resolution remote sensing images. In response, this article introduces an efficient multi-scale network (EMNet) tailored for semantic segmentation of common features in remote sensing images. To address these challenges, EMNet integrates several key components. First, an efficient atrous spatial pyramid pooling module is employed to strengthen the relevance of multi-scale targets, improving the extraction and processing of context information across scales. Second, an efficient multi-scale attention mechanism and multi-scale skip connections fuse semantic features from different levels, yielding precise segmentation boundaries and accurate position information. Finally, an encoder-decoder structure is incorporated to refine the segmentation results. The effectiveness of the proposed network is validated through experiments on the publicly available DroneDeploy and Potsdam datasets. Results indicate that EMNet achieves strong performance, with mean intersection over union (MIoU), mean precision (MPrecision), and mean recall (MRecall) reaching 75.99%, 86.76%, and 85.07%, respectively. Comparative analysis demonstrates that the proposed network outperforms current mainstream semantic segmentation networks on both the DroneDeploy and Potsdam datasets.
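The atrous spatial pyramid pooling module mentioned above is built on dilated (atrous) convolution, which enlarges a filter's receptive field by sampling the input with gaps, so that parallel branches with different dilation rates capture context at several scales. The following is a minimal illustrative sketch of that idea in one dimension; it is not the paper's EMNet implementation, and the function name and test signal are invented for demonstration.

```python
# Illustrative sketch only: 1-D dilated ("atrous") convolution, the building
# block behind atrous spatial pyramid pooling (ASPP). A larger dilation rate
# widens the receptive field without adding kernel parameters, which is how
# ASPP branches gather context at several scales in parallel.

def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1-D convolution with a dilation factor (hypothetical helper)."""
    k = len(kernel)
    span = (k - 1) * dilation  # receptive field minus one
    out = []
    for start in range(len(signal) - span):
        acc = sum(kernel[j] * signal[start + j * dilation] for j in range(k))
        out.append(acc)
    return out

signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]  # simple difference filter

# Dilation 1 compares samples 2 apart; dilation 2 compares samples 4 apart,
# i.e. the same 3-tap kernel now "sees" a 5-sample window.
print(dilated_conv1d(signal, kernel, 1))
print(dilated_conv1d(signal, kernel, 2))
```

An ASPP-style module runs several such branches with different dilation rates on the same feature map and concatenates the results, letting the network relate small and large targets in a single pass.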
Funder
Inner Mongolia Natural Science Foundation
National Natural Science Foundation of China