Y-Net: Dual-branch Joint Network for Semantic Segmentation

Published:2021-11-30 Issue:4 Volume:17 Page:1-22
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Chen Yizhen¹, Hu Haifeng¹

Affiliation:

1. School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou, People’s Republic of China

Abstract

Most existing segmentation networks are built upon a “U-shaped” encoder–decoder structure, where the multi-level features extracted by the encoder are gradually aggregated by the decoder. Although this structure has been proven effective in improving segmentation performance, it has two main drawbacks. On the one hand, introducing low-level features significantly increases computation without an obvious performance gain. On the other hand, general feature aggregation strategies such as addition and concatenation fuse features without considering the usefulness of each feature vector, which mixes useful information with substantial noise. In this article, we abandon the traditional “U-shaped” architecture and propose Y-Net, a dual-branch joint network for accurate semantic segmentation. Specifically, it aggregates only the low-resolution high-level features and uses the global context guidance generated by the first branch to refine the second branch. The dual branches are connected through a Semantic Enhancing Module, which can be regarded as a combination of spatial attention and channel attention. We also design a novel Channel-Selective Decoder (CSD) to adaptively integrate features from different receptive fields by assigning input-dependent channel-wise weights. Our Y-Net is capable of breaking through the limit of a single-branch network and attaining higher performance at lower computational cost than the “U-shaped” structure. The proposed CSD can better integrate useful information and suppress interfering noise. Comprehensive experiments are carried out on three public datasets to evaluate the effectiveness of our method. Our Y-Net achieves state-of-the-art performance on the PASCAL VOC 2012, PASCAL Person-Part, and ADE20K datasets without pre-training on extra datasets.
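
The idea of input-dependent channel-wise weights mentioned in the abstract can be illustrated with a small sketch. This is not the authors' CSD: the abstract does not specify its layers, so the code below assumes a simplified scheme in which each branch's globally averaged channel response is used directly as a logit, and a softmax across branches yields the per-channel weights. A real decoder would presumably learn this mapping. The names `global_avg_pool`, `softmax`, and `channel_selective_fuse` are hypothetical and chosen for illustration only.

```python
import math


def global_avg_pool(feat):
    """Average each channel of a [C][H][W] nested list, giving C values."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def softmax(values):
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def channel_selective_fuse(branches):
    """Fuse same-shaped feature maps with per-channel, input-dependent weights.

    branches: list of B feature maps, each a [C][H][W] nested list.
    Returns (fused, weights), where weights[b][c] sums to 1 over b.

    ASSUMPTION: pooled responses act as logits; no learned parameters.
    """
    descriptors = [global_avg_pool(f) for f in branches]  # B x C
    num_branches = len(branches)
    num_channels = len(descriptors[0])
    height = len(branches[0][0])
    width = len(branches[0][0][0])

    # For every channel, compete across branches with a softmax.
    per_channel = [
        softmax([descriptors[b][c] for b in range(num_branches)])
        for c in range(num_channels)
    ]
    weights = [
        [per_channel[c][b] for c in range(num_channels)]
        for b in range(num_branches)
    ]

    # Weighted sum of the branches, channel by channel.
    fused = [
        [
            [
                sum(weights[b][c] * branches[b][c][i][j] for b in range(num_branches))
                for j in range(width)
            ]
            for i in range(height)
        ]
        for c in range(num_channels)
    ]
    return fused, weights


# Two branches, two channels, 2x2 maps (made-up values).
branch_a = [[[2.0, 2.0], [2.0, 2.0]], [[0.0, 0.0], [0.0, 0.0]]]
branch_b = [[[2.0, 2.0], [2.0, 2.0]], [[4.0, 4.0], [4.0, 4.0]]]
fused, weights = channel_selective_fuse([branch_a, branch_b])
```

In this toy input, the two branches respond equally on the first channel and so share it evenly, while the second channel is dominated by the branch with the stronger response. Plain addition or concatenation would treat both branches identically regardless of the input.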

Funder

National Natural Science Foundation of China

Natural Science Foundation of Guangdong Province

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3460940

Reference: 57 articles.

1. Face and Hair Region Labeling Using Semi-Supervised Spectral Clustering-Based Multiple Segmentations

2. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

3. Spatio-temporal saliency networks for dynamic saliency prediction;Bak Cagdas;IEEE Trans. Multimedia,2017

Cited by 7 articles.

1. SNIPPET: A Framework for Subjective Evaluation of Visual Explanations Applied to DeepFake Detection;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-06-13

2. ConvMedSegNet: A multi-receptive field depthwise convolutional neural network for medical image segmentation;Computers in Biology and Medicine;2024-06

3. WaRENet: A Novel Urban Waterlogging Risk Evaluation Network;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-05-16

4. Learning Nighttime Semantic Segmentation the Hard Way;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-05-16

5. 3V3D: Three-View Contextual Cross-slice Difference Three-dimensional Medical Image Segmentation Adversarial Network;ACM Transactions on Multimedia Computing, Communications, and Applications;2023-07-12