Multi-Stage Network for Event-Based Video Deblurring with Residual Hint Attention
Author:
Kim Jeongmin 1, Jung Yong Ju 1
Affiliation:
1. School of Computing, Gachon University, Seongnam 13120, Republic of Korea
Abstract
Video deblurring aims to remove the motion blur caused by object movement or camera shake. Traditional video deblurring methods have mainly focused on frame-based deblurring, which takes only blurry frames as the input to produce sharp frames. However, frame-based deblurring yields poor picture quality in challenging cases of video restoration, where severely blurred frames are provided as the input. To overcome this issue, recent studies have begun to explore the event-based approach, which uses the event sequence captured by an event camera for motion deblurring. Event cameras have several advantages over conventional frame cameras. In particular, event cameras have low latency in imaging data acquisition (0.001 ms for event cameras vs. 10 ms for frame cameras). Hence, event data can be acquired at a very high temporal resolution (up to one microsecond). This means that the event sequence contains more accurate motion information than video frames. Additionally, event data can be acquired with less motion blur. Owing to these advantages, the use of event data is highly beneficial for improving the quality of deblurred frames. Accordingly, the results of event-based video deblurring are superior to those of frame-based methods, even for severely blurred video frames. However, the direct use of event data can often generate visual artifacts in the final output frame (e.g., image noise and incorrect textures), because event data intrinsically contain insufficient texture and event noise. To tackle this issue in event-based deblurring, we propose a two-stage coarse-refinement network that adds a frame-based refinement stage, which utilizes all the available frames with their more abundant textures to further improve the picture quality of the first-stage coarse output. Specifically, a coarse intermediate frame is estimated by performing event-based video deblurring in the first-stage network. A residual hint attention (RHA) module is also proposed to extract useful attention information from the coarse output and all the available frames. This module connects the first and second stages and effectively guides the frame-based refinement of the coarse output. The final deblurred frame is then obtained in the second-stage network by refining the coarse output using the residual hint attention and all the available frame information. We validated the deblurring performance of the proposed network on the GoPro synthetic dataset (33 videos and 4702 frames) and the HQF real dataset (11 videos and 2212 frames). Compared to the state-of-the-art method (D2Net), the proposed network achieved improvements of 1 dB in PSNR and 0.05 in SSIM on the GoPro dataset, and 1.7 dB in PSNR and 0.03 in SSIM on the HQF dataset.
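The abstract describes the two-stage coarse-refinement architecture only at a high level. The PyTorch sketch below illustrates one plausible reading of that pipeline; it is a minimal sketch, not the authors' implementation. The class names (TwoStageDeblurNet, ResidualHintAttention), layer choices, channel counts, the five-bin event voxel representation, and the use of three neighboring frames are all illustrative assumptions, and the RHA internals (a residual-driven spatial gate) are a guess at the mechanism the abstract names.

```python
import torch
import torch.nn as nn

class ResidualHintAttention(nn.Module):
    """Hypothetical RHA sketch: the residual between coarse-output features
    and frame features acts as a 'hint' of under-restored regions and is
    turned into a gate over the frame features."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, coarse_feat, frame_feat):
        hint = coarse_feat - frame_feat      # residual "hint"
        return frame_feat * self.gate(hint)  # attention-weighted frame features

class TwoStageDeblurNet(nn.Module):
    """Stage 1: event-based coarse deblurring; Stage 2: frame-based
    refinement guided by the RHA module. All sizes are assumptions."""
    def __init__(self, event_bins=5, num_neighbors=3, channels=32):
        super().__init__()
        # Stage 1 consumes the blurry frame plus an event voxel grid.
        self.stage1 = nn.Sequential(
            nn.Conv2d(3 + event_bins, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )
        self.coarse_enc = nn.Conv2d(3, channels, 3, padding=1)
        self.frame_enc = nn.Conv2d(3 * num_neighbors, channels, 3, padding=1)
        self.rha = ResidualHintAttention(channels)
        # Stage 2 refines the coarse frame using RHA-gated frame features.
        self.stage2 = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, blurry, events, neighbors):
        coarse = self.stage1(torch.cat([blurry, events], dim=1))
        cf = self.coarse_enc(coarse)
        ff = self.frame_enc(neighbors)
        hinted = self.rha(cf, ff)
        refined = coarse + self.stage2(torch.cat([cf, hinted], dim=1))
        return coarse, refined

# Usage with dummy tensors (batch of one 64x64 clip):
net = TwoStageDeblurNet()
blurry = torch.randn(1, 3, 64, 64)   # blurry RGB frame
events = torch.randn(1, 5, 64, 64)   # 5-bin event voxel grid (assumed)
frames = torch.randn(1, 9, 64, 64)   # 3 neighboring RGB frames, stacked
coarse, refined = net(blurry, events, frames)
```

Having stage 2 predict a residual on top of the coarse frame mirrors the coarse-refinement idea in the abstract: the second stage only corrects what the event-based stage got wrong, rather than re-synthesizing the frame.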
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
2 articles.