Scene Text Recognition Based on Bidirectional LSTM and Deep Neural Network

Published: 2021-11-23 Volume: 2021 Page: 1-11
ISSN: 1687-5273
Container-title: Computational Intelligence and Neuroscience
Language: en
Short-container-title: Computational Intelligence and Neuroscience

Author:

Kantipudi MVV Prasad¹, Kumar Sandeep², Kumar Jha Ashish³

Affiliation:

1. Department of E&TC, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune 412115, India

2. Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India

3. Nepal Engineering College, Kathmandu, Nepal

Abstract

Deep learning is a subfield of artificial intelligence that allows computers to adapt and learn new rules. Deep learning algorithms can identify images, objects, observations, texts, and other structures. In recent years, scene text recognition has attracted many researchers in the computer vision community, yet it still needs improvement because existing scene text recognition algorithms perform poorly. This paper proposes a novel approach to scene text recognition that integrates a bidirectional LSTM (Bi-LSTM) with a deep convolutional neural network (CNN). In the proposed method, the contour of the image is identified first and then fed into the CNN, which generates an ordered sequence of features from the contoured image. This feature sequence is then encoded by the Bi-LSTM, which is well suited to extracting features from sequences of words. The paper thus combines two powerful feature extraction mechanisms, and the contour-based input image makes recognition faster, which gives this technique an advantage over existing methods. The proposed method is evaluated on the MSRATD 50 dataset, the SVHN dataset, a vehicle number plate dataset, the SVT dataset, and random datasets, with accuracies of 95.22%, 92.25%, 96.69%, 94.58%, and 98.12%, respectively. According to quantitative and qualitative analysis, this approach is more promising in terms of accuracy and precision rate.
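
The data flow described in the abstract (CNN feature map, ordered feature sequence, bidirectional LSTM encoding) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the contour step and the CNN are omitted, a random array stands in for the CNN output, the weights are untrained, and all names (`feature_map_to_sequence`, `LSTMCell`, `run_lstm`, `bidirectional_encode`) and sizes are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feature_map_to_sequence(feature_map):
    """Turn a C x H x W feature map into W column vectors of length C*H,
    ordered left to right (one vector per image column)."""
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [feature_map[c][h][w] for c in range(channels) for h in range(height)]
        for w in range(width)
    ]


class LSTMCell:
    """A standard LSTM cell with small random, untrained weights (assumption)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.weights = {
            g: [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(hidden_size)]
            for g in "ifgo"
        }
        self.bias = {g: [0.0] * hidden_size for g in "ifgo"}

    def _gate(self, name, z, activation):
        return [
            activation(sum(w * v for w, v in zip(row, z)) + b)
            for row, b in zip(self.weights[name], self.bias[name])
        ]

    def step(self, x, h, c):
        z = x + h  # concatenate the input vector and the previous hidden state
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        g = self._gate("g", z, math.tanh)
        o = self._gate("o", z, sigmoid)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, sequence):
    """Run the cell over the sequence and return the hidden state at each step."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in sequence:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def bidirectional_encode(sequence, forward_cell, backward_cell):
    """Concatenate forward and backward hidden states for every time step."""
    forward = run_lstm(forward_cell, sequence)
    backward = run_lstm(backward_cell, sequence[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]


rng = random.Random(0)
# Stand-in for a CNN output with C=3 channels, H=2 rows, W=6 columns.
feature_map = [[[rng.random() for _ in range(6)] for _ in range(2)] for _ in range(3)]
sequence = feature_map_to_sequence(feature_map)
encoded = bidirectional_encode(sequence, LSTMCell(6, 4, rng), LSTMCell(6, 4, rng))
```

Each of the 6 image columns yields one 8-dimensional vector (4 forward plus 4 backward hidden units), so every position sees context from both its left and its right. In a full recognizer, a trained classification layer would map these vectors to characters.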

Publisher

Hindawi Limited

Subject

General Mathematics, General Medicine, General Neuroscience, General Computer Science

Link

http://downloads.hindawi.com/journals/cin/2021/2676780.pdf

References: 32 articles.

1. A Detailed Analysis of Optical Character Recognition Technology

2. Text detection and recognition in imagery: a survey; Q. Ye; IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015

3. Text detection in images based on unsupervised classification of high-frequency wavelet coefficients

4. End-to-end text recognition with hybrid HMM maxout models; O. Alsharif

Cited by 16 articles.

1. Intelligent detection and mileage positioning of multiple distresses using two-step deep learning; Automation in Construction; 2024-10

2. Optimized Fake News Classification: Leveraging Ensembles Learning and Parameter Tuning in Machine and Deep Learning Methods; Applied Artificial Intelligence; 2024-07-30

3. Identifying Early Signs of Bipolar Disorder Risk by Food Habit Analysis in Forensic Using Machine Learning; 2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE); 2024-05-09

4. Predicting Personality Traits of Introverts and Extroverts for Forensic Applications; 2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE); 2024-05-09

5. Enhancing Video Anomaly Detection Using Spatio-Temporal Autoencoders and Convolutional LSTM Networks; SN Computer Science; 2024-01-11