Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow

Published:2020-04-03 Issue:07 Volume:34 Page:10713-10720
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Ding Mingyu,Wang Zhe,Zhou Bolei,Shi Jianping,Lu Zhiwu,Luo Ping

Abstract

A major challenge for video semantic segmentation is the lack of labeled data. In most benchmark datasets, only one frame of a video clip is annotated, which makes most supervised methods fail to utilize information from the rest of the frames. To exploit the spatio-temporal information in videos, many previous works use pre-computed optical flows, which encode the temporal consistency to improve the video segmentation. However, the video segmentation and optical flow estimation are still considered as two separate tasks. In this paper, we propose a novel framework for joint video semantic segmentation and optical flow estimation. Semantic segmentation brings semantic information to handle occlusion for more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences to guarantee the temporal consistency of the segmentation. Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference. Extensive experiments show that the proposed model makes the video semantic segmentation and optical flow estimation benefit from each other and outperforms existing methods under the same settings in both tasks.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 43 articles.

1. Adaptify: A Refined Test-Time Adaptation Scheme for Frame Classification Consistency in Atrophic Gastritis Videos;2024 IEEE International Symposium on Biomedical Imaging (ISBI);2024-05-27

2. Video Generalized Semantic Segmentation via Non-Salient Feature Reasoning and Consistency;Knowledge-Based Systems;2024-05

3. Video object segmentation via couple streams and feature memory;IET Image Processing;2024-04-17

4. MCFNet: Multi-Attentional Class Feature Augmentation Network for Real-Time Scene Parsing;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-03-08

5. How to track and segment fish without human annotations: a self-supervised deep learning approach;Pattern Analysis and Applications;2024-02-23