STDF: Spatio-Temporal Deformable Fusion for Video Quality Enhancement on Embedded Platforms

Authors:

Jianing Deng [1], Shunjie Dong [2], Lvcheng Chen [3], Jingtong Hu [1], Cheng Zhuo [3]

Affiliations:

1. University of Pittsburgh, Pittsburgh, USA

2. Department of Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; and College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China

3. Zhejiang University, Hangzhou, China

Abstract

With the development of embedded systems and deep learning, it has become feasible to combine the two to offer a variety of convenient human-centered services, many of which depend on high-quality (HQ) video. However, due to constrained video traffic load and unavoidable noise, the visual quality of frames captured by an edge camera may degrade significantly, harming the overall video and service quality. To counteract this degradation, video quality enhancement (QE), which aims at recovering HQ videos from their distorted low-quality (LQ) sources, has attracted increasing attention in recent years. The key challenge of video QE lies in how to effectively aggregate complementary information from multiple frames (i.e., temporal fusion). To handle the diverse motion in videos, existing methods commonly apply motion compensation before temporal fusion. However, the motion field estimated from a distorted LQ video tends to be inaccurate and unreliable, resulting in ineffective fusion and restoration. In addition, motion estimation for consecutive frames is generally conducted in a pairwise manner, which is computationally expensive and inefficient. In this article, we propose a fast yet effective temporal fusion scheme for video QE that incorporates a novel Spatio-Temporal Deformable Convolution (STDC) to compensate motion and aggregate temporal information simultaneously. Specifically, the proposed scheme takes a target frame along with its adjacent reference frames as input and jointly estimates an offset field that deforms the spatio-temporal sampling positions of the convolution. As a result, complementary information from multiple frames can be fused within a single forward pass of the STDC operation. Extensive experimental results on three benchmark datasets show that our method performs favorably against the state of the art in terms of both accuracy and efficiency.
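To make the fusion mechanism described above concrete, the following is a minimal PyTorch sketch of the idea: stack the target frame with its reference frames along the channel axis, predict one 2-D offset per frame per kernel sampling position, and let a single deformable convolution compensate motion and fuse the frames in one forward pass. It assumes torchvision's deform_conv2d; the names (STDCFusion, offset_net) and the shallow three-layer offset predictor are illustrative stand-ins for the paper's offset-prediction network, not the authors' released code.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d


class STDCFusion(nn.Module):
    """Sketch of spatio-temporal deformable fusion (names are illustrative)."""

    def __init__(self, num_frames=7, in_channels=1, fused_channels=64, k=3):
        super().__init__()
        self.k = k
        stacked = num_frames * in_channels
        # One (dx, dy) offset per frame (offset group) per kernel position.
        offset_channels = 2 * num_frames * k * k
        # Stand-in for the paper's offset-prediction network: maps the
        # stacked LQ frames to the spatio-temporal offset field.
        self.offset_net = nn.Sequential(
            nn.Conv2d(stacked, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, offset_channels, 3, padding=1),
        )
        # Fusion weights of a single deformable conv over all frames; its
        # sampling grid is deformed per frame by the offsets above.
        self.weight = nn.Parameter(
            torch.randn(fused_channels, stacked, k, k) * 0.01)
        self.bias = nn.Parameter(torch.zeros(fused_channels))

    def forward(self, frames):
        # frames: (B, T, C, H, W) -> stacked along channels: (B, T*C, H, W)
        b, t, c, h, w = frames.shape
        x = frames.reshape(b, t * c, h, w)
        offsets = self.offset_net(x)  # (B, 2*T*k*k, H, W)
        # One call both compensates motion (per-frame offset groups) and
        # aggregates temporal information, i.e., the single forward pass.
        return deform_conv2d(x, offsets, self.weight, self.bias,
                             stride=1, padding=self.k // 2)


if __name__ == "__main__":
    fusion = STDCFusion(num_frames=7, in_channels=1)
    lq_clip = torch.rand(2, 7, 1, 64, 64)   # batch of 7-frame LQ clips
    print(fusion(lq_clip).shape)            # torch.Size([2, 64, 64, 64])
```

Because each frame gets its own offset group, every reference frame is sampled at its own motion-compensated positions, which is why no pairwise optical-flow estimation is needed before fusion.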

Funder

Zhejiang Provincial Key R&D Program

Publisher

Association for Computing Machinery (ACM)

