Video-Restoration-Net: Deep Generative Model with Non-Local Network for Inpainting and Super-Resolution Tasks
Published: 2023-09-05
Issue: 18
Volume: 13
Page: 10001
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences
Author:
Zheng Yuanfeng ¹, Yan Yuchen ¹, Jiang Hao ¹
Affiliation:
1. School of Electronic Information, Wuhan University, Wuhan 430072, China
Abstract
Although deep learning-based approaches to video processing have been investigated extensively, the lack of generality in network design limits their practical application, particularly in video restoration. This paper therefore presents a universal video restoration model that can tackle video inpainting and super-resolution tasks simultaneously. The network, called Video-Restoration-Net (VRN), consists of four components: (1) an encoder that extracts features from each frame, (2) a non-local network that recombines features from adjacent frames or from different locations within a given frame, (3) a decoder that restores a coarse video from the output of the non-local block, and (4) a refinement network that refines the coarse video at the frame level. The framework is trained in a three-step pipeline to improve training stability on both tasks. Specifically, we first propose an automated technique to generate a full video dataset for super-resolution reconstruction and a complete-incomplete video dataset for inpainting. A VRN is then trained to inpaint the incomplete videos. Meanwhile, the full video dataset is used to train another VRN frame by frame, which is validated on standard benchmark datasets. In quantitative comparisons with several baseline models, our method achieves 40.5042 dB/0.99473 (PSNR/SSIM) on the inpainting task and, on the super-resolution task, 28.41 dB/0.7953 and 27.25 dB/0.8152 on BSD100 and Urban100, respectively. Qualitative comparisons demonstrate that the proposed model completes masked regions and performs super-resolution reconstruction in videos at high quality. These results show that our method is more versatile across video inpainting and super-resolution tasks than recent models.
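This record does not include the authors' code, so the following is only a minimal PyTorch sketch of how the four components described in the abstract could be wired together. The layer widths, the use of 3D convolutions in the encoder/decoder, the embedded-Gaussian form of the non-local block, and the residual 2D refinement head are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Self-attention over space-time features (embedded-Gaussian form);
    recombines features from adjacent frames and distant spatial locations."""
    def __init__(self, channels):
        super().__init__()
        self.theta = nn.Conv3d(channels, channels // 2, kernel_size=1)
        self.phi = nn.Conv3d(channels, channels // 2, kernel_size=1)
        self.g = nn.Conv3d(channels, channels // 2, kernel_size=1)
        self.out = nn.Conv3d(channels // 2, channels, kernel_size=1)

    def forward(self, x):                       # x: (B, C, T, H, W)
        b, c, t, h, w = x.shape
        q = self.theta(x).flatten(2)             # (B, C/2, THW)
        k = self.phi(x).flatten(2)
        v = self.g(x).flatten(2)
        attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)   # (B, THW, THW)
        y = (v @ attn.transpose(1, 2)).view(b, c // 2, t, h, w)
        return x + self.out(y)                    # residual connection

class VRN(nn.Module):
    """Sketch of the four-stage pipeline: encoder -> non-local ->
    decoder (coarse video) -> frame-level refinement."""
    def __init__(self, in_ch=3, feat=64):
        super().__init__()
        # (1) feature encoder (hypothetical widths/depths)
        self.encoder = nn.Sequential(
            nn.Conv3d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(feat, feat, 3, padding=1), nn.ReLU(inplace=True))
        # (2) non-local feature recombination across frames/locations
        self.non_local = NonLocalBlock(feat)
        # (3) decoder producing the coarse restored video
        self.decoder = nn.Sequential(
            nn.Conv3d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(feat, in_ch, 3, padding=1))
        # (4) frame-level refinement network (applied per frame)
        self.refine = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, in_ch, 3, padding=1))

    def forward(self, video):                    # video: (B, C, T, H, W)
        coarse = self.decoder(self.non_local(self.encoder(video)))
        b, c, t, h, w = coarse.shape
        # fold time into the batch so the refinement runs frame-wise
        frames = coarse.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w)
        refined = frames + self.refine(frames)    # residual per-frame refinement
        return refined.view(b, t, c, h, w).permute(0, 2, 1, 3, 4)

# quick shape check on a tiny random clip
if __name__ == "__main__":
    clip = torch.rand(1, 3, 4, 32, 32)            # (B, C, T, H, W)
    print(VRN()(clip).shape)                       # torch.Size([1, 3, 4, 32, 32])
```

The same forward pass could serve both tasks described in the abstract: a masked video as input for inpainting, or an upsampled low-resolution video for super-resolution, which is consistent with the paper's claim of a universal restoration network.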
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science