Learning Nighttime Semantic Segmentation the Hard Way

Published:2024-05-16 Issue:7 Volume:20 Page:1-23
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Liu Wenxi¹, Cai Jiaxin¹, Li Qi¹, Liao Chenyang¹, Cao Jingjing², He Shengfeng³, Yu Yuanlong¹

Affiliation:

1. College of Computer and Data Science, Fuzhou University, Fuzhou, China

2. School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan, China

3. Singapore Management University, Singapore, Singapore

Abstract

Nighttime semantic segmentation is an important but challenging research problem for autonomous driving. The major challenges lie in small objects or regions that fall in under- or over-exposed areas or that suffer from motion blur caused by cameras mounted on moving vehicles. To resolve this, we propose a novel hard-class-aware module that bridges the main network for full-class segmentation and the hard-class network for segmenting the aforementioned hard-class objects. Specifically, it exploits the shared focus on hard-class objects across the dual-stream network, enabling contextual information flow to guide the model to concentrate on the pixels that are hard to classify. Finally, the estimated hard-class segmentation results are used to infer the final results via an adaptive probabilistic fusion refinement scheme. Moreover, to overcome the over-smoothing and noise caused by extreme exposures, our model is modulated by a carefully crafted pretext task of constructing an exposure-aware semantic gradient map, which guides the model to faithfully perceive the structural and semantic information of hard-class objects while mitigating the negative impact of noise and uneven exposure. In experiments, we demonstrate that our network design leads to superior segmentation performance over existing methods, with a strong ability to perceive hard-class objects under adverse conditions.

Funder

National Natural Science Foundation of China

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3650032
