Efficient greenhouse segmentation with visual foundation models: achieving more with fewer samples

Published: 2024-07-30
Volume: 12
ISSN: 2296-665X
Container-title: Frontiers in Environmental Science
Short-container-title: Front. Environ. Sci.

Authors:

Lu Yuxiang, Wang Jiahe, Wang Dan, Liu Tang

Abstract

Introduction: The Vision Transformer (ViT) model, which leverages self-supervised learning, has shown exceptional performance in natural image segmentation, suggesting its extensive potential in visual tasks. However, its effectiveness diminishes in remote sensing due to the varying perspectives of remote sensing images and unique optical properties of features like the translucency of greenhouses. Additionally, the high cost of training Visual Foundation Models (VFMs) from scratch for specific scenes limits their deployment.

Methods: This study investigates the feasibility of rapidly deploying VFMs on new tasks by using embedding vectors generated by VFMs as prior knowledge to enhance traditional segmentation models’ performance. We implemented this approach to improve the accuracy and robustness of segmentation with the same number of trainable parameters. Comparative experiments were conducted to evaluate the efficiency and effectiveness of this method, especially in the context of greenhouse detection and management.

Results: Our findings indicate that the use of embedding vectors facilitates rapid convergence and significantly boosts segmentation accuracy and robustness. Notably, our method achieves or exceeds the performance of traditional segmentation models using only about 40% of the annotated samples. This reduction in the reliance on manual annotation has significant implications for remote sensing applications.

Discussion: The application of VFMs in remote sensing tasks, particularly for greenhouse detection and management, demonstrated enhanced segmentation accuracy and reduced dependence on annotated samples. This method adapts more swiftly to different lighting conditions, enabling more precise monitoring of agricultural resources. Our study underscores the potential of VFMs in remote sensing tasks and opens new avenues for the expansive application of these models in diverse downstream tasks.

Publisher

Frontiers Media SA

References (36 articles)

1. Hyperspectral remote sensing data analysis and future challenges; Bioucas-Dias; IEEE Geoscience Remote Sens. Mag., 2013

2. Language models are few-shot learners; Brown; Adv. Neural Inf. Process. Syst., 2020

3. Time travelling pixels: bitemporal features integration with foundation model for remote sensing image change detection; Chen, 2023

4. Rethinking atrous convolution for semantic image segmentation; Chen, 2017

5. A simple framework for contrastive learning of visual representations; Chen, 2020