iDAM: Iteratively Trained Deep In-loop Filter with Adaptive Model Selection

Published:2023-01-23 Issue:1s Volume:19 Page:1-22
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Li Yue¹, Zhang Li¹, Zhang Kai¹

Affiliation:

1. Bytedance Inc., San Diego, CA, USA

Abstract

With the rapid development of neural-network-based machine learning algorithms, deep learning methods are being tried in a much wider range of applications than well-known ones such as face recognition or autonomous driving. Recently, deep learning models have been investigated intensively to improve the compression efficiency of video coding, especially at the in-loop filtering stage. Although prior deep learning-based in-loop filtering methods have already shown remarkable potential in video coding, the content propagation issue has not yet been well recognized or addressed. Content propagation refers to the fact that the content of reference frames is propagated to the frames that refer to them, which typically leads to over-filtering. In this article, we develop an iteratively trained deep in-loop filter with adaptive model selection (iDAM) to address the content propagation issue. First, we propose an iterative training scheme, which enables the network to gradually take into account the impact of content propagation. Second, we propose a filter selection mechanism that allows a block to select from a set of candidate filters with different filtering strengths. In addition, we propose a novel approach to designing a conditional in-loop filtering method that can handle multiple quality levels with a single model and provide filter selection by modifying the input parameters. Extensive experiments on top of the latest video coding standard (Versatile Video Coding, VVC) have been conducted to evaluate the proposed techniques. Compared with VTM-11.0, our scheme achieves a new state of the art, with average BD-rate reductions of {7.91%, 20.25%, 20.44%}, {11.64%, 26.40%, 26.50%}, and {10.97%, 26.63%, 26.77%} for {Y, Cb, Cr} under the all-intra, random-access, and low-delay configurations, respectively. As far as we know, the proposed iDAM scheme provides the highest coding performance among all existing solutions.
In addition, the syntax elements of the proposed scheme were adopted at the 76th meeting of the Audio Video coding Standard (AVS) held this year.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3529107

References: 53 articles.

1. Calculation of Average PSNR Differences Between RD-curves;Bjontegaard Gisle;Technical Report VCEG-M33,2001

2. AHG11: Replacing SAO in-loop filter with neural networks;Bordes Philippe;JVET-V0092,2021

3. JVET common test conditions and software reference configurations for SDR video;Bossen Frank;JVET-K1010,2018

4. Versatile video coding (draft 10);Bross Benjamin;JVET-S2001,2020

5. EE-2.1.5: In-loop filtering based on neural network;Chen Wei;JVET-U0101,2021

Cited by 11 articles.

1. AEGANAuth: Autoencoder GAN-Based Continuous Authentication With Conditional Variational Autoencoder Generative Adversarial Network;IEEE Internet of Things Journal;2024-08-15

2. Survey on Visual Signal Coding and Processing With Generative Models: Technologies, Standards, and Optimization;IEEE Journal on Emerging and Selected Topics in Circuits and Systems;2024-06

3. A Reconfigurable Framework for Neural Network Based Video In-Loop Filtering;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-03-08

4. Reinforcement Learning for SAR Target Orientation Inference With the Differentiable SAR Renderer;IEEE Transactions on Geoscience and Remote Sensing;2024

5. NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines;2023 IEEE International Symposium on Multimedia (ISM);2023-12-11