A Masked-Pre-Training-Based Fast Deep Image Prior Denoising Model

Authors:

Ji Shuichen 1, Xu Shaoping 2, Cheng Qiangqiang 3, Xiao Nan 2, Zhou Changfei 2, Xiong Minghai 2

Affiliations:

1. School of Information Engineering, Nanchang University, Nanchang 330031, China

2. School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China

3. School of Mechanical and Electronic Engineering, Gandong University, Fuzhou 344000, China

Abstract

Compared with supervised deep-learning-based denoising models, the unsupervised Deep Image Prior (DIP) denoising approach offers greater flexibility and practicality because it operates solely on the given noisy image. However, the random initialization of the network input and network parameters in DIP leads to slow convergence during iterative training, which severely limits execution efficiency. To address this issue, we propose the Masked-Pre-Training-Based Fast DIP (MPFDIP) denoising model. We enhance the classical Restormer framework by improving its core Transformer module and incorporating sampling, residual learning, and refinement techniques, yielding a fast network called FRformer (Fast Restormer). The FRformer network is first pre-trained offline with supervised learning using a masked processing technique. For a specific noisy image, the pre-trained FRformer, with its learned parameters, then replaces the UNet used in the original DIP model. The online iterative training of the resulting model follows the unsupervised DIP training approach, using multi-target images and an adaptive loss function, which further improves the denoising effectiveness of the pre-trained model. Extensive experiments demonstrate that MPFDIP outperforms existing mainstream deep-learning-based denoising models in removing Gaussian noise, mixed Gaussian–Poisson noise, and low-dose CT noise, while also significantly improving execution efficiency over the original DIP model. This improvement is mainly attributed to the initialization parameters obtained through masked pre-training, which generalize well across noise types and intensities and already provide a partial denoising effect; using them to initialize the unsupervised iterative training in DIP greatly accelerates its convergence. The multi-target images and the adaptive loss function further enhance the denoising process.
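
To make the two-stage procedure described in the abstract concrete, the following is a minimal PyTorch sketch of the online DIP-style refinement stage, assuming a pre-trained denoiser is available. PretrainedDenoiser is a hypothetical stand-in for the pre-trained FRformer, and the paper's multi-target images and adaptive loss function are not reproduced here; a plain MSE against the noisy image (the standard DIP data term) is used instead.

```python
# Minimal sketch of the online DIP-style fine-tuning stage, assuming a PyTorch
# setting. "PretrainedDenoiser" is a hypothetical placeholder for the pre-trained
# FRformer; the multi-target images and adaptive loss from the paper are omitted.
import torch
import torch.nn as nn


class PretrainedDenoiser(nn.Module):
    """Placeholder network standing in for the pre-trained FRformer."""

    def __init__(self, channels=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x):
        # Residual learning: the network predicts the noise and subtracts it.
        return x - self.body(x)


def dip_finetune(model, noisy, iters=300, lr=1e-4):
    """Unsupervised DIP-style refinement on a single noisy image.

    Starting from pre-trained weights (rather than random initialization)
    is what the abstract credits for the faster convergence.
    """
    model.train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        out = model(noisy)
        loss = nn.functional.mse_loss(out, noisy)  # standard DIP data term
        loss.backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        return model(noisy)


if __name__ == "__main__":
    noisy_img = torch.rand(1, 3, 64, 64)   # stand-in for a real noisy image
    net = PretrainedDenoiser()              # pre-trained weights would be loaded here
    denoised = dip_finetune(net, noisy_img, iters=10)
    print(denoised.shape)
```

In this sketch the only difference from the original DIP loop is the starting point: the network weights come from the offline masked pre-training rather than from random initialization, so far fewer online iterations are needed before the output stabilizes.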

Funder

Natural Science Foundation of China

Publisher

MDPI AG

