A Convolutional Recurrent Neural-Network-Based Machine Learning for Scene Text Recognition Application

Author:

Liu Yiyi1ORCID,Wang Yuxin1ORCID,Shi Hongjian1ORCID

Affiliation:

1. Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, Beijing Normal University—Hong Kong Baptist University United International College, Zhuhai 519087, China

Abstract

Optical character recognition (OCR) is the process of acquiring text and layout information through analysis and recognition of text data image files. It is also a process to identify the geometric location and orientation of the texts and their symmetrical behavior. It usually consists of two steps: text detection and text recognition. Scene text recognition is a subfield of OCR that focuses on processing text in natural scenes, such as streets, billboards, license plates, etc. Unlike traditional document category photographs, it is a challenging task to use computer technology to locate and read text information in natural scenes. Imaging sequence recognition is a longstanding subject of research in the field of computer vision. Great progress has been made in this field; however, most models struggled to recognize text in images of complex scenes with high accuracy. This paper proposes a new pattern of text recognition based on the convolutional recurrent neural network (CRNN) as a solution to address this issue. It combines real-time scene text detection with differentiable binarization (DBNet) for text detection and segmentation, text direction classifier, and the Retinex algorithm for image enhancement. To evaluate the effectiveness of the proposed method, we performed experimental analysis of the proposed algorithm, and carried out simulation on complex scene image data based on existing literature data and also on several real datasets designed for a variety of nonstationary environments. Experimental results demonstrated that our proposed model performed better than the baseline methods on three benchmark datasets and achieved on-par performance with other approaches on existing datasets. This model can solve the problem that CRNN cannot identify text in complex and multi-oriented text scenes. Furthermore, it outperforms the original CRNN model with higher accuracy across a wider variety of application scenarios.

Funder

BNU-HKBU United International College

Guangdong Higher Education Key Platform and Research Project

Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science

Publisher

MDPI AG

Subject

Physics and Astronomy (miscellaneous),General Mathematics,Chemistry (miscellaneous),Computer Science (miscellaneous)

Cited by 7 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Differential Analysis of Modern Text Spotting Methods : A Systematic Review;International Journal of Scientific Research in Computer Science, Engineering and Information Technology;2024-09-05

2. Data-Driven Strategies for Complex System Forecasts: The Role of Textual Big Data and State-Space Transformers in Decision Support;Systems;2024-05-10

3. Utilizing Artificial Intelligence for Text Segmentation from Images;PRZEGLĄD ELEKTROTECHNICZNY;2024-02-19

4. An Attention-Based Convolutional Recurrent Neural Networks for Scene Text Recognition;IEEE Access;2024

5. Attention-Based Deep Learning Algorithm in Natural Language Processing for Optical Character Recognition;2023 International Conference on Evolutionary Algorithms and Soft Computing Techniques (EASCT);2023-10-20

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3