A Convolutional Recurrent Neural-Network-Based Machine Learning for Scene Text Recognition Application

Published: 2023-04-02 Issue: 4 Volume: 15 Page: 849
ISSN: 2073-8994
Container-title: Symmetry
language: en
Short-container-title: Symmetry

Author:

Liu Yiyi¹, Wang Yuxin¹, Shi Hongjian¹

Affiliation:

1. Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, Beijing Normal University—Hong Kong Baptist University United International College, Zhuhai 519087, China

Abstract

Optical character recognition (OCR) is the process of extracting text and layout information from image files by analyzing and recognizing the text they contain. It also identifies the geometric location and orientation of the text and its symmetrical behavior. OCR usually consists of two steps: text detection and text recognition. Scene text recognition is a subfield of OCR that focuses on processing text in natural scenes, such as streets, billboards, and license plates. Unlike reading text in traditional scanned documents, locating and reading text in natural scenes is a challenging task for computer technology. Image-based sequence recognition is a longstanding subject of research in computer vision. Great progress has been made in this field; however, most models still struggle to recognize text in images of complex scenes with high accuracy. This paper proposes a new text recognition approach based on the convolutional recurrent neural network (CRNN) to address this issue. It combines real-time scene text detection with differentiable binarization (DBNet) for text detection and segmentation, a text direction classifier, and the Retinex algorithm for image enhancement. To evaluate the effectiveness of the proposed method, we performed an experimental analysis of the proposed algorithm and carried out simulations on complex scene image data drawn from the existing literature, as well as on several real datasets designed for a variety of nonstationary environments. Experimental results demonstrated that the proposed model performed better than the baseline methods on three benchmark datasets and achieved on-par performance with other approaches on existing datasets. The model addresses CRNN's inability to identify text in complex and multi-oriented text scenes. Furthermore, it outperforms the original CRNN model, achieving higher accuracy across a wider variety of application scenarios.
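
The abstract names the Retinex algorithm as the image enhancement step but does not say which variant or parameters the authors use. As a rough illustration only, the sketch below shows the general idea behind single-scale and multi-scale Retinex on a grayscale image stored as nested lists: subtract the log of a Gaussian-blurred "surround" from the log of each pixel. The function names, the sigma values, the equal weighting of scales, and the `eps` offset are all assumptions made for this example, not details taken from the paper.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel covering roughly +/- 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + i - radius, 0), width - 1)]
             for i, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def single_scale_retinex(image, sigma=2.0, eps=1.0):
    """Per pixel: log(I + eps) - log(blur(I) + eps). eps avoids log(0)."""
    surround = gaussian_blur(image, sigma)
    return [
        [math.log(p + eps) - math.log(s + eps) for p, s in zip(row, srow)]
        for row, srow in zip(image, surround)
    ]


def multi_scale_retinex(image, sigmas=(1.0, 2.0, 4.0)):
    """Equal-weight average of single-scale outputs (assumed weighting)."""
    outputs = [single_scale_retinex(image, s) for s in sigmas]
    height, width = len(image), len(image[0])
    return [
        [sum(o[y][x] for o in outputs) / len(outputs) for x in range(width)]
        for y in range(height)
    ]


# Usage: a dark 7x7 patch with one bright pixel in the middle.
patch = [[10.0] * 7 for _ in range(7)]
patch[3][3] = 200.0
enhanced = multi_scale_retinex(patch)
```

In this sketch a uniformly lit region maps to values near zero, while pixels brighter than their surroundings come out positive and darker ones negative, which is why Retinex-style preprocessing can help with unevenly lit scene images. A practical pipeline would normally rescale the output to a displayable range and use an optimized blur from an image library.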

Funder

BNU-HKBU United International College

Guangdong Higher Education Key Platform and Research Project

Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science

Publisher

MDPI AG

Subject

Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science (miscellaneous)

Link

https://www.mdpi.com/2073-8994/15/4/849/pdf

Reference: 26 articles.

1. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition;Shi;IEEE Trans. Pattern Anal. Mach. Intell.,2017

2. Kalchbrenner, N., Grefenstette, E., and Blunsom, P. (2014, January 24–27). A convolutional neural network for modelling sentences. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Baltimore, MD, USA.

3. Liu, Y., Chen, H., Shen, C., He, T., Jin, L., and Wang, L. (2020, January 7–12). Real-time scene text detection with differentiable binarization. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.

4. Multi-scale retinex for color image enhancement;Rahman;Proceedings of the International Conference on Image Processing,1996

5. Shi, B., Bai, X., and Yao, C. (2017, January 21–26). Detecting Oriented Text in Natural Images by Linking Segments. Proceedings of the Computer Vision and Pattern Recognition, Honolulu, HI, USA.

Cited by 7 articles.

1. Differential Analysis of Modern Text Spotting Methods : A Systematic Review;International Journal of Scientific Research in Computer Science, Engineering and Information Technology;2024-09-05

2. Data-Driven Strategies for Complex System Forecasts: The Role of Textual Big Data and State-Space Transformers in Decision Support;Systems;2024-05-10

3. Utilizing Artificial Intelligence for Text Segmentation from Images;PRZEGLĄD ELEKTROTECHNICZNY;2024-02-19

4. An Attention-Based Convolutional Recurrent Neural Networks for Scene Text Recognition;IEEE Access;2024

5. Attention-Based Deep Learning Algorithm in Natural Language Processing for Optical Character Recognition;2023 International Conference on Evolutionary Algorithms and Soft Computing Techniques (EASCT);2023-10-20