Author:
Yue Hongwei,Huang Yufeng,Vong Chi-Man,Jin Yingying,Zeng Zhiqiang,Yu Mingqi,Chen Chuangquan
Abstract
AbstractScene text recognition (STR) has been widely applied in industrial and commercial fields. However, existing methods still face challenges when processing text images with defects such as low contrast, blur, low resolution, and insufficient illumination. These defects are common in actual situations because of diverse text backgrounds in natural scenes and limitations in shooting conditions. To address these challenges, we propose a novel network for noise-robust scene text recognition (NRSTRNet), which comprehensively suppresses the noise in the three critical steps of STR. Specifically, in the text feature extraction stage, NRSTRNet enhances the text-related features through the channel and spatial dimensions and disregards some disturbances from the non-text area, reducing the noise and redundancy in the input image. In the context encoding stage, fine-grained feature coding is proposed to effectively reduce the influence of previous noisy temporal features on current temporal features while simultaneously reducing the impact of partial noise on the overall encoding by sharing contextual feature encoding parameters. In the decoding stage, a self-attention module is added to enhance the connections between different temporal features, thereby leveraging the global information to obtain noise-resistant features. Through these approaches, NRSTRNet can enhance the local semantic information while considering the global semantic information. Experimental results show that the proposed NRSTRNet can improve the ability to characterize text images, enhance stability under the influence of noise, and achieve superior accuracy in text recognition. As a result, our model outperforms SOTA STR models on irregular text recognition benchmarks by 2% on average, and it is exceptionally robust when applied to noisy images.
Funder
Characteristic Innovation Projects of Colleges and Universities of Guangdong Province
Guangdong Basic and Applied Basic Research Foundation
Publisher
Springer Science and Business Media LLC
Subject
Computational Mathematics,General Computer Science
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献