Learned Video Compression with Adaptive Temporal Prior and Decoded Motion-aided Quality Enhancement

Author:

Yang Jiayu (1), Yang Chunhui (1), Xiong Fei (1), Zhai Yongqi (2), Wang Ronggang (3)

Affiliation:

1. Peking University Shenzhen Graduate School, Shenzhen, China

2. Peking University Shenzhen Graduate School, Shenzhen, China and Peng Cheng Laboratory, Shenzhen, China

3. School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, China; Peng Cheng Laboratory, Shenzhen, China; and MIGU Video Co., Ltd., Shenzhen, China

Abstract

Learned video compression has recently drawn great attention and shown promising compression performance. In this article, we focus on two components of the learned video compression framework, the conditional entropy model and the quality enhancement module, to improve compression performance. Specifically, we propose an adaptive spatial-temporal entropy model for image, motion, and residual compression, which introduces a temporal prior to reduce the temporal redundancy of latents and an additional modulated mask to evaluate the similarity and perform refinement. In addition, a quality enhancement module is proposed for the predicted frame and the reconstructed frame to improve frame quality and reduce the bitrate cost of residual coding. The module reuses decoded optical flow as a motion prior and utilizes deformable convolution to mine high-quality information from the reference frame in a bit-free manner. The two proposed coding tools are integrated into a pixel-domain residual coding–based compression framework to evaluate their effectiveness. Experimental results demonstrate that our framework achieves competitive compression performance in the low-delay scenario compared with recent learning-based methods and traditional H.265/HEVC in terms of Peak Signal-to-Noise Ratio (PSNR) and Multi-Scale Structural Similarity Index (MS-SSIM). The code is available at OpenLVC.
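
The abstract reports results in terms of PSNR. As a reminder of what that metric measures, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE). It is not taken from the paper or from the OpenLVC code: the function name `psnr`, the flat-list frame representation, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, decoded, max_value=255.0):
    """Return PSNR in dB between two equally sized frames.

    Frames are given as flat sequences of pixel values (an assumption
    for this sketch); max_value is the peak signal value, 255 for 8-bit.
    """
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error between original and decoded pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-pixel "frames" (hypothetical values, not data from the paper).
reference = [52, 55, 61, 66]
decoded = [54, 55, 60, 66]
print(round(psnr(reference, decoded), 2))
```

Higher values mean the decoded frame is closer to the original. In rate-distortion comparisons such as the one the abstract describes, PSNR is typically plotted against bitrate.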

Funder

Outstanding Talents Training Fund in Shenzhen

Shenzhen Science and Technology Program–Shenzhen Cultivation of Excellent Scientific and Technological Innovation Talents project

Shenzhen Science and Technology Program–Shenzhen Hong Kong joint funding project

National Natural Science Foundation of China

MIGU-PKU Meta Vision Technology Innovation Lab

Publisher

Association for Computing Machinery (ACM)

