Exploring Neighbor Correspondence Matching for Multiple-hypotheses Video Frame Synthesis

Published: 2024-01-11 Issue: 4 Volume: 20 Page: 1-20
ISSN: 1551-6857
Container-title: ACM Transactions on Multimedia Computing, Communications, and Applications
Language: en
Short-container-title: ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Jia Zhaoyang¹, Lu Yan², Li Houqiang¹

Affiliation:

1. University of Science and Technology of China, China

2. Microsoft Research Asia, China

Abstract

Video frame synthesis, which consists of interpolation and extrapolation, is an essential video processing technique that can be applied to various scenarios. However, most existing methods cannot handle small objects or large motion well, especially in high-resolution videos such as 4K videos. To eliminate such limitations, we introduce a neighbor correspondence matching (NCM) algorithm for flow-based frame synthesis. Since the current frame is not available in video frame synthesis, NCM is performed in a current-frame-agnostic fashion to establish multi-scale correspondences in the spatial-temporal neighborhoods of each pixel. Based on the powerful motion representation capability of NCM, we propose a heterogeneous coarse-to-fine scheme for intermediate flow estimation. The coarse-scale and fine-scale modules are trained progressively, making NCM computationally efficient and robust to large motions. We further explore the mechanism of NCM and find that neighbor correspondence is powerful, since it provides multiple-hypotheses motion information for synthesis. Based on this analysis, we introduce a multiple-hypotheses estimation process for video frame extrapolation, resulting in a more robust framework, NCM-MH. Experimental results show that NCM and NCM-MH achieve 31.63 and 28.08 dB for interpolation and extrapolation, respectively, on the most challenging X4K1000FPS benchmark, outperforming all the other state-of-the-art methods that use two reference frames as input.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3633780
