Affiliation:
1. School of Information Science and Engineering, Shenyang Ligong University, Shenyang 110159, China
Abstract
Road segmentation from high-resolution (HR) remote sensing images plays a core role in a wide range of applications. Due to the complex background of HR images, most of the current methods struggle to extract a road network correctly and completely. Furthermore, they suffer from either the loss of context information or high redundancy of details information. To alleviate these problems, we employ a dual branch dilated pyramid network (DPBFN), which enables dual-branch feature passing between two parallel paths when it is merged to a typical road extraction structure. A DPBFN consists of three parts: a residual multi-scaled dilated convolutional network branch, a transformer branch, and a fusion module. Constructing pyramid features through parallel multi-scale dilated convolution operations with multi-head attention block can enhance road features while suppressing redundant information. Both branches after fusing can solve shadow or vision occlusions and maintain the continuity of the road network, especially on a complex background. Experiments were carried out on three datasets of HR images to showcase the stable performance of the proposed method, and the results are compared with those of other methods. The OA in the three data sets of Massachusetts, Deep Globe, and GF-2 can reach more than 98.26%, 95.25%, and 95.66%, respectively, which has a significant improvement compared with the traditional CNN network. The results and explanation analysis via Grad-CAMs showcase the effective performance in accurately extracting road segments from a complex scene.
Funder
Liaoning Provincial Department of Education Youth Project
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Reference40 articles.
1. Spatial information inference net: Road extraction using road-specific contextual information;Tao;ISPRS J. Photogramm. Remote Sens.,2019
2. MSACon: Mining Spatial Attention-Based Contextual Information for Road Extraction;Xu;IEEE Trans. Geosci. Remote Sens.,2022
3. Attention is all you need;Vaswani;Adv. Neural Inf. Process. Syst.,2017
4. Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5–9). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany. Proceedings, Part III 18.
5. Segnet: A deep convolutional encoder-decoder architecture for image segmentation;Badrinarayanan;IEEE Trans. Pattern Anal. Mach. Intell.,2017