Affiliation:
1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, China
Abstract
While recent years have witnessed the unprecedented success of deep convolutional neural networks (CNNs) and vision transformers in single-image super-resolution (SISR), these methods typically assume a simple degradation, usually bicubic downsampling. Their performance therefore drops dramatically when the actual degradation does not match this assumption, and they cannot handle multiple degradations (e.g. Gaussian noise, bicubic downsampling, and salt-and-pepper noise). To address these issues, the authors propose a joint SR model (JIRSR) that can effectively handle multiple degradations in a single model. Specifically, the authors build parallel Transformer and CNN branches that complement each other through bidirectional feature fusion. The authors also apply a random permutation of different kinds of noise and resizing operations to build the training datasets. Extensive experiments on classical SR, denoising, and multiple-degradation removal demonstrate that JIRSR achieves state-of-the-art (SOTA) performance on public benchmarks. Concretely, JIRSR outperforms the second-best model by 0.23 to 0.74 dB on multiple-degradation removal and is 0.20 to 0.36 dB higher than the SOTA methods on the Urban100 dataset for the ×4 SR task.
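The abstract gives no implementation details for the randomly permuted degradation pipeline, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes a grayscale image with values in [0, 1]. All function names, noise levels, and the scale factor are hypothetical, and block averaging stands in for bicubic downsampling to keep the example dependent on NumPy only.

```python
import numpy as np

def add_gaussian_noise(img, rng, sigma=0.05):
    # Additive zero-mean Gaussian noise, clipped back to [0, 1].
    # sigma is an assumed value, not taken from the paper.
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def add_salt_pepper(img, rng, amount=0.02):
    # Set a random fraction of pixels to 0 (pepper) or 1 (salt).
    # amount is an assumed value, not taken from the paper.
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < amount / 2] = 0.0
    out[mask > 1 - amount / 2] = 1.0
    return out

def downsample(img, rng, scale=2):
    # Block averaging: a simple stand-in for bicubic downsampling.
    # rng is unused; it is accepted so all operations share one signature.
    h = img.shape[0] // scale * scale
    w = img.shape[1] // scale * scale
    img = img[:h, :w]
    return img.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def degrade(hr, rng):
    # Apply the three degradations in a randomly permuted order and
    # return the degraded image together with the order that was used.
    ops = [add_gaussian_noise, add_salt_pepper, downsample]
    order = rng.permutation(len(ops))
    lr = hr
    for i in order:
        lr = ops[i](lr, rng)
    return lr, [ops[i].__name__ for i in order]

rng = np.random.default_rng(0)
hr = rng.random((64, 64))        # synthetic "high-resolution" image
lr, order = degrade(hr, rng)     # one (low-quality, order) training sample
```

Calling `degrade` repeatedly on the same high-resolution image yields low-quality inputs whose degradations were applied in different orders, which is the kind of variety a randomly permuted pipeline is meant to provide.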
Funder
National Key Research and Development Program of China
Publisher
Institution of Engineering and Technology (IET)