Affiliation:
1. School of Computer Science, China University of Geosciences, Wuhan 430074, China
Abstract
Remote sensing image semantic segmentation plays a crucial role in various fields, such as environmental monitoring, urban planning, and agricultural land classification. However, most current research focuses primarily on the spatial and spectral information of single-temporal remote sensing images and neglects the temporal information present in historical image sequences. In fact, historical images often capture phenological variations in land features, which exhibit diverse patterns and can significantly benefit semantic segmentation tasks. This paper introduces a semantic segmentation framework for satellite image time series (SITS) based on dilated convolution and a Transformer encoder. The framework consists of spatial encoding and temporal encoding. Spatial encoding uses dilated convolutions exclusively, which mitigates the loss of spatial accuracy and reduces the need for up-sampling, while a combination of different dilation rates and dense connections allows rich multi-scale features to be extracted. Temporal encoding uses a Transformer encoder to extract temporal features for each pixel in the image. To better capture the annual periodic patterns of phenological phenomena in land features, the position encoding is calculated from each image's acquisition date within the year. To assess the performance of this framework, comparative and ablation experiments were conducted on the PASTIS dataset. The experiments indicate that the framework achieves highly competitive performance with relatively few trainable parameters, improving the mean Intersection over Union (mIoU) by 8 percentage points.
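The date-based position encoding described in the abstract can be illustrated with a minimal sketch. It assumes the standard sinusoidal Transformer encoding with the day of the year substituted for the sequence index; the function names, the dimension `d_model`, and the base of 10000 are illustrative assumptions and are not taken from the paper.

```python
import math
from datetime import date
from typing import List


def day_of_year(d: date) -> int:
    """Return the 1-based day of the year for an acquisition date."""
    return d.timetuple().tm_yday


def doy_position_encoding(d: date, d_model: int = 8, base: float = 10000.0) -> List[float]:
    """Sinusoidal encoding indexed by day of year (hypothetical helper).

    Assumption: the standard sin/cos formulation is used, with the
    position replaced by the acquisition day within the year.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    pos = day_of_year(d)
    enc: List[float] = []
    for i in range(0, d_model, 2):
        # Each sin/cos pair shares one frequency; i is the even channel index.
        angle = pos / (base ** (i / d_model))
        enc.append(math.sin(angle))
        enc.append(math.cos(angle))
    return enc


# Example: encode an image acquired on 1 March 2019.
vector = doy_position_encoding(date(2019, 3, 1), d_model=8)
```

Under these assumptions, two images acquired on the same calendar day of different non-leap years receive the same vector, so the encoding depends on the position within the annual cycle rather than on the order of the images in the sequence.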
Funder
National Natural Science Foundation of China Joint Fund Key Project
Open Research Project of The Hubei Key Laboratory of Intelligent Geo-Information Processing
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science