Affiliation:
1. Bytedance Inc., San Diego, CA, USA
Abstract
As a rapid development of neural-network-based machine learning algorithms, deep learning methods are being tentatively used in a much wider range than well-known artificial intelligence applications such as face recognition or auto-driving. Recently, deep learning models are investigated intensively to improve the compression efficiency for video coding, especially at the in-loop filtering stage. Although deep learning-based in-loop filtering methods in prior arts have already shown a remarkable potential capability in video coding, content propagation issue is still not well recognized and addressed yet. Content propagation is the fact that contents of reference frames are propagated to frames referring to them, which typically leads to over-filtering issues. In this article, we develop an iteratively trained deep in-loop filter with adaptive model selection (iDAM) to address the content propagation issue. First, we propose an iterative training scheme, which enables the network to gradually take into account the impacts of content propagation. Second, we propose a filter selection mechanism, i.e., allowing a block to select from a set of candidate filters with different filtering strengths. Besides, we propose a novel approach to design a conditional in-loop filtering method that can deal with multiple quality levels with a single model and serve the functionality of filter selection by modifying the input parameters. Extensive experiments on top of the latest video coding standard (Versatile Video Coding, VVC) have been conducted to evaluate the proposed techniques. Compared with VTM-11.0, our scheme achieves a new state-of-the-art, leading to {7.91%, 20.25%, 20.44%}, {11.64%, 26.40%, 26.50%}, and {10.97%, 26.63%, 26.77%} BD-rate reductions on average for {Y, Cb, Cr} under all-intra, random-access, and low-delay configurations, respectively. As far as we know, our proposed iDAM scheme provides the highest coding performance compared to all existing solutions. In addition, the syntax elements of the proposed scheme were adopted at the 76th meeting of Audio Video coding Standard (AVS) held this year.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications,Hardware and Architecture
Reference53 articles.
1. Gisle Bjontegaard. 2001. Calcuation of Average PSNR Differences Between RD-curves. Technical Report VCEG-M33. VCEG.
2. AHG11: Replacing SAO in-loop filter with neural networks;Bordes Philippe;JVET-V0092,2021
3. JVET common test conditions and software reference configurations for SDR video;Bossen Frank;JVET-K1010,2018
4. Versatile video coding (draft 10);Bross Benjamin;JVET-S2001,2020
5. EE-2.1.5: In-loop filtering based on neural network;Chen Wei;JVET-U0101,2021
Cited by
11 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献