Joint Video Super-Resolution and Frame Interpolation via Permutation Invariance
Authors:
Jinsoo Choi 1 and Tae-Hyun Oh 2,3
Affiliations:
1. Department of Electrical Engineering, KAIST, Daejeon 34141, Republic of Korea
2. Department of Electrical Engineering and Graduate School of AI, Pohang University of Science and Technology (POSTECH), Pohang 37673, Republic of Korea
3. Department of Artificial Intelligence, Yonsei University, Seoul 03722, Republic of Korea
Abstract
We propose a joint super-resolution (SR) and frame interpolation framework that performs both spatial and temporal super-resolution. We identify that the performance of video super-resolution and video frame interpolation varies with the permutation of the input frames. We postulate that features extracted from multiple frames should be consistent regardless of input order if they are optimally complementary to the respective frames. With this motivation, we propose a permutation-invariant deep architecture that exploits multi-frame SR principles. Specifically, given two adjacent frames, our model employs a permutation-invariant convolutional neural network module to extract complementary feature representations that facilitate both the SR and the temporal interpolation tasks. We demonstrate the effectiveness of our end-to-end joint method against various combinations of competing SR and frame interpolation methods on challenging video datasets, thereby verifying our hypothesis.
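To make the permutation-invariance idea concrete, the sketch below shows one standard way to build an order-invariant two-frame feature extractor: a shared-weight (Siamese) encoder followed by symmetric max and mean pooling, so that swapping the input frames leaves the fused feature unchanged. This is a minimal illustration under our own assumptions, not the authors' released implementation; all class, module, and parameter names here are hypothetical.

# Minimal sketch (hypothetical, not the paper's code): a permutation-invariant
# two-frame feature extractor. Shared weights plus symmetric aggregation
# guarantee identical output for inputs (f0, f1) and (f1, f0).
import torch
import torch.nn as nn

class PermInvariantEncoder(nn.Module):
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        # Shared per-frame encoder: both frames pass through the same
        # weights, so neither frame is treated as "first".
        self.encode = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Fusion over symmetric statistics of the two per-frame features.
        self.fuse = nn.Conv2d(2 * feat_ch, feat_ch, 3, padding=1)

    def forward(self, frame_a, frame_b):
        fa, fb = self.encode(frame_a), self.encode(frame_b)
        # Element-wise max and mean are order-invariant aggregations,
        # so swapping the inputs yields the same fused representation.
        pooled = torch.cat([torch.maximum(fa, fb), (fa + fb) / 2], dim=1)
        return self.fuse(pooled)

if __name__ == "__main__":
    net = PermInvariantEncoder()
    f0, f1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    # Output is identical regardless of input order.
    assert torch.allclose(net(f0, f1), net(f1, f0))

The key design choice is that any asymmetric operation (e.g., concatenating the two frame features in a fixed order) would break invariance; only symmetric pooling over shared-weight features keeps the extracted representation independent of frame order, which is the property the abstract argues favors complementary features.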
Funder
National Research Foundation of Korea (NRF) grant funded by the Korea government
Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry