Affiliation:
1. Pattern Recognition and Computer Vision Lab, Zhejiang Sci-Tech University, Hangzhou 310018, China
2. Key Laboratory of Digital Design and Intelligent Manufacture in Culture & Creativity Product of Zhejiang Province, Lishui University, Lishui 323000, China
Abstract
Scene classification in remote sensing is a pivotal research area, traditionally relying on visual information from aerial images for labeling. The introduction of ground environment audio as a novel geospatial data source adds valuable information for scene classification. However, bridging the structural gap between aerial images and ground environment audio is challenging, rendering popular two-branch networks ineffective for direct data fusion. To address this issue, this study presents the Two-stage Fusion-based Audiovisual Classification Network (TFAVCNet). TFAVCNet uses separate audio and visual modules to extract deep semantic features from ground environment audio and remote sensing images, respectively. The audiovisual fusion module then fuses information from both modalities at the feature and decision levels, facilitating joint training and yielding a more robust solution. Experimental results on the ADVANCE dataset for remote sensing audiovisual scene classification show that the proposed method outperforms existing approaches.
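The idea of fusing two modalities at both the feature level and the decision level can be illustrated with a minimal NumPy sketch. This is not the TFAVCNet architecture, which the abstract does not specify in detail. It assumes that feature-level fusion is a simple concatenation and that decision-level fusion is an unweighted average of class probabilities. The function `two_stage_fusion`, the weight matrices, and all dimensions are hypothetical placeholders chosen for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def two_stage_fusion(audio_feat, visual_feat, w_audio, w_visual, w_joint):
    """Illustrative two-level fusion (an assumption, not the paper's method).

    Feature level: concatenate the audio and visual embeddings and
    classify the joint vector.
    Decision level: average the class probabilities of the audio-only,
    visual-only, and joint classifiers.
    """
    # Feature-level fusion: one joint vector per sample.
    joint_feat = np.concatenate([audio_feat, visual_feat], axis=-1)

    # Per-branch class probabilities from linear classifiers.
    p_audio = softmax(audio_feat @ w_audio)
    p_visual = softmax(visual_feat @ w_visual)
    p_joint = softmax(joint_feat @ w_joint)

    # Decision-level fusion: unweighted mean of the three predictions.
    return (p_audio + p_visual + p_joint) / 3.0


# Random stand-ins for embeddings that real audio/visual encoders would produce.
rng = np.random.default_rng(0)
batch, d_audio, d_visual, n_classes = 4, 8, 16, 5  # arbitrary sizes

audio_feat = rng.normal(size=(batch, d_audio))
visual_feat = rng.normal(size=(batch, d_visual))
w_audio = rng.normal(size=(d_audio, n_classes))
w_visual = rng.normal(size=(d_visual, n_classes))
w_joint = rng.normal(size=(d_audio + d_visual, n_classes))

fused = two_stage_fusion(audio_feat, visual_feat, w_audio, w_visual, w_joint)
predicted_class = fused.argmax(axis=-1)
```

Each row of `fused` is a probability distribution over the placeholder classes. In a trained system the three classifiers would be learned jointly, and the averaging step could be replaced by a learned weighting.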
Funder
Natural Science Foundation of Zhejiang Province
National Natural Science Foundation of China
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science