KPN-MFI: A Kernel Prediction Network with Multi-frame Interaction for Video Inverse Tone Mapping

Published: 2022-07
Container-title: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

Author:

Cao Gaofeng¹², Zhou Fei³²⁴⁵⁶, Yan Han⁷, Wang Anjie¹², Fan Leidong¹²

Affiliation:

1. Peking University, Shenzhen

2. Peng Cheng Laboratory

3. Shenzhen University

4. Guangdong Key Laboratory of Intelligent Information Processing

5. Shenzhen Key Laboratory of Digital Creative Technology

6. Shenzhen Institute for Artificial Intelligence and Robotics for Society (AIRS)

7. Harbin Institute of Technology, Shenzhen

Abstract

Up to now, image-based inverse tone mapping (iTM) models have been widely investigated, while there is little research on video-based iTM methods. It would be interesting to make use of these existing image-based models in the video iTM task. However, directly transferring image-based iTM models to video data without modeling spatial-temporal information remains nontrivial and challenging. Considering both the intra-frame quality and the inter-frame consistency of a video, this article presents a new video iTM method based on a kernel prediction network (KPN), which takes advantage of a multi-frame interaction (MFI) module to capture temporal-spatial information for video data. Specifically, a basic encoder-decoder KPN, essentially designed for image iTM, is trained to guarantee the mapping quality within each frame. More importantly, the MFI module is incorporated to capture temporal-spatial context information and preserve the inter-frame consistency by exploiting the correlation between adjacent frames. Notably, we can readily extend any existing image iTM model to a video iTM one by adding the proposed MFI module. Furthermore, we propose an inter-frame brightness consistency loss function based on the Gaussian pyramid to reduce the temporal inconsistency of the video. Extensive experiments demonstrate that our model outperforms state-of-the-art image-based and video-based methods. The code is available at https://github.com/caogaofeng/KPNMFI.
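
As a rough illustration of the general kernel prediction idea mentioned in the abstract, the sketch below filters a single-channel frame with a separate small kernel at every pixel. It is not the authors' implementation (see their repository for that): the function name `apply_per_pixel_kernels`, the list-of-lists data layout, and the replicate-style border handling are all assumptions made for this example, and the kernels are passed in directly rather than predicted by a network.

```python
def apply_per_pixel_kernels(frame, kernels):
    """Filter `frame` with a separate k x k kernel at every pixel.

    Hypothetical helper for illustration only.
    frame:   H x W list of lists of floats (one channel).
    kernels: H x W list of lists; kernels[y][x] is a k x k list of weights
             (in a kernel prediction network these would be network outputs).
    Borders are handled by clamping coordinates to the image (an assumption).
    """
    height, width = len(frame), len(frame[0])
    k = len(kernels[0][0])
    radius = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Clamp neighbour coordinates so they stay inside the frame.
                    yy = min(max(y + i - radius, 0), height - 1)
                    xx = min(max(x + j - radius, 0), width - 1)
                    acc += kernels[y][x][i][j] * frame[yy][xx]
            out[y][x] = acc
    return out


# Usage: an identity kernel at every pixel leaves the frame unchanged,
# while a centre weight of 2.0 doubles the brightness of every pixel.
frame = [[1.0, 2.0], [3.0, 4.0]]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
gain = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
same = apply_per_pixel_kernels(frame, [[identity] * 2 for _ in range(2)])
brighter = apply_per_pixel_kernels(frame, [[gain] * 2 for _ in range(2)])
```

Because each pixel gets its own weights, the mapping can vary spatially, for example expanding highlights more strongly than shadows, which a single global convolution kernel cannot do.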

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 6 articles.

1. Removing Banding Artifacts in HDR Videos Generated From Inverse Tone Mapping;IEEE Transactions on Broadcasting;2024-06

2. A Database and Model for the Visual Quality Assessment of Super-Resolution Videos;IEEE Transactions on Broadcasting;2024-06

3. A Dataset and Model for the Visual Quality Assessment of Inversely Tone-Mapped HDR Videos;IEEE Transactions on Image Processing;2024

4. Redistributing the Precision and Content in 3D-LUT-based Inverse Tone-mapping for HDR/WCG Display;Proceedings of the 20th ACM SIGGRAPH European Conference on Visual Media Production;2023-11-30

5. Video Inverse Tone Mapping Network with Luma and Chroma Mapping;Proceedings of the 31st ACM International Conference on Multimedia;2023-10-26