Head-Free Lightweight Semantic Segmentation with Linear Transformer

Published:2023-06-26 Issue:1 Volume:37 Page:516-524
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Dong Bo, Wang Pichao, Wang Fan

Abstract

Existing semantic segmentation work has mainly focused on designing effective decoders; however, the computational load introduced by the overall structure has long been ignored, which hinders deployment on resource-constrained hardware. In this paper, we propose a head-free lightweight architecture specifically for semantic segmentation, named Adaptive Frequency Transformer (AFFormer). AFFormer adopts a parallel architecture that leverages prototype representations as specific learnable local descriptions, which replace the decoder and preserve the rich image semantics on high-resolution features. Although removing the decoder cuts most of the computation, the accuracy of the parallel structure is still limited by low computational resources. Therefore, we employ heterogeneous operators (CNN and vision Transformer) for pixel embedding and prototype representations to further save computational cost. Moreover, it is very difficult to linearize the complexity of the vision Transformer from the perspective of the spatial domain. Because semantic segmentation is very sensitive to frequency information, we construct a lightweight prototype learning block with an adaptive frequency filter of complexity O(n) to replace standard self-attention, whose complexity is O(n^2). Extensive experiments on widely adopted datasets demonstrate that AFFormer achieves superior accuracy while retaining only 3M parameters. On the ADE20K dataset, AFFormer achieves 41.8 mIoU at 4.6 GFLOPs, which is 4.4 mIoU higher than SegFormer with 45% fewer GFLOPs. On the Cityscapes dataset, AFFormer achieves 78.7 mIoU at 34.4 GFLOPs, which is 2.5 mIoU higher than SegFormer with 72.5% fewer GFLOPs. Code is available at https://github.com/dongbo811/AFFormer.
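
The abstract gives AFFormer's results only relative to the SegFormer baseline, so the baseline figures have to be worked out by the reader. The short Python sketch below does that arithmetic. It uses only the numbers quoted in the abstract; the helper name `implied_baseline` and the dictionary layout are illustrative assumptions, and the derived values are rounded estimates, not figures taken from either paper.

```python
def implied_baseline(value, relative_reduction):
    """Return the baseline cost implied when `value` is `relative_reduction` lower.

    If value = baseline * (1 - relative_reduction), then
    baseline = value / (1 - relative_reduction).
    """
    return value / (1.0 - relative_reduction)


# Figures quoted in the abstract, per dataset:
# (AFFormer mIoU, AFFormer GFLOPs, mIoU gain, fractional GFLOPs reduction)
reported = {
    "ADE20K": (41.8, 4.6, 4.4, 0.45),
    "Cityscapes": (78.7, 34.4, 2.5, 0.725),
}

# Derived (not reported) baseline figures, rounded to one decimal place.
implied = {}
for dataset, (miou, gflops, gain, reduction) in reported.items():
    implied[dataset] = {
        "baseline_miou": round(miou - gain, 1),
        "baseline_gflops": round(implied_baseline(gflops, reduction), 1),
    }
    print(dataset, implied[dataset])
```

Under these assumptions, the comparison baseline works out to roughly 37.4 mIoU at 8.4 GFLOPs on ADE20K and 76.2 mIoU at 125.1 GFLOPs on Cityscapes. The abstract does not say which SegFormer variant was used for the comparison.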

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Cited by 16 articles.

1. MicroSeg: Multi-scale fusion learning for microaneurysms segmentation;Biomedical Signal Processing and Control;2024-11

2. DSNet: A dynamic squeeze network for real-time weld seam image segmentation;Engineering Applications of Artificial Intelligence;2024-07

3. Decoupling semantic and localization for semantic segmentation via magnitude-aware and phase-sensitive learning;Information Fusion;2024-07

4. Goal-Oriented Source Coding and Filtering for Vehicular Communications;IEEE Internet of Things Journal;2024-06-01

5. EPSSNet;International Journal on Semantic Web and Information Systems;2024-04-02