TextScanner: Reading Characters in Order for Robust Scene Text Recognition

Published: 2020-04-03 Issue: 07 Volume: 34 Page: 12120-12127
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Wan Zhaoyi, He Minghang, Chen Haoran, Bai Xiang, Yao Cong

Abstract

Driven by deep learning and a large volume of data, scene text recognition has evolved rapidly in recent years. RNN-attention-based methods formerly dominated this field, but they suffer from attention drift in certain situations. More recently, semantic-segmentation-based algorithms have proven effective at recognizing text of different forms (horizontal, oriented and curved). However, these methods may produce spurious characters or miss genuine characters, as they rely heavily on a thresholding procedure applied to segmentation maps. To tackle these challenges, we propose in this paper an alternative approach, called TextScanner, for scene text recognition. TextScanner has three characteristics: (1) it belongs to the semantic segmentation family, as it generates pixel-wise, multi-channel segmentation maps for character class, position and order; (2) akin to RNN-attention-based methods, it also adopts an RNN for context modeling; (3) it predicts character position and class in parallel, and ensures that characters are transcribed in the correct order. Experiments on standard benchmark datasets demonstrate that TextScanner outperforms the state-of-the-art methods. Moreover, TextScanner shows its superiority in recognizing more difficult text, such as Chinese transcripts, and in aligning with target characters.
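
The abstract does not spell out how the class, position and order maps are combined, so the following is only an illustrative sketch, not the authors' implementation. It assumes one per-pixel probability map per character class and one per-pixel weight map per reading-order slot, and it reads out each slot by summing class probabilities weighted by that slot's map, with no thresholding of the segmentation. The function name `decode_in_order`, the parameter `min_mass`, and the toy maps are all hypothetical.

```python
def decode_in_order(class_maps, order_maps, alphabet, min_mass=0.5):
    """Read characters slot by slot from per-pixel maps (illustrative only).

    class_maps[c][y][x]: assumed probability that pixel (y, x) shows class c.
    order_maps[k][y][x]: assumed weight of pixel (y, x) for the k-th character.
    min_mass: hypothetical cutoff; slots with less total weight count as empty.
    """
    text = []
    for order_map in order_maps:
        # Total weight of this slot; near zero means no k-th character exists.
        mass = sum(sum(row) for row in order_map)
        if mass < min_mass:
            continue
        # Score each class by integrating its map under the slot's weights.
        scores = [
            sum(c * o
                for class_row, order_row in zip(class_map, order_map)
                for c, o in zip(class_row, order_row))
            for class_map in class_maps
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        text.append(alphabet[best])
    return "".join(text)


# Toy 1x4 "image": class 'a' occupies the left half, class 'b' the right half.
CLASS_MAPS = [
    [[0.9, 0.8, 0.1, 0.2]],  # class 'a'
    [[0.1, 0.2, 0.9, 0.8]],  # class 'b'
]
# The order maps say the first character sits on the right, the second on the
# left, and the third slot is empty.
ORDER_MAPS = [
    [[0.0, 0.0, 1.0, 1.0]],
    [[1.0, 1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.0]],
]

print(decode_in_order(CLASS_MAPS, ORDER_MAPS, "ab"))
```

In this toy case the reading order is taken from the order maps rather than from left-to-right pixel position, which is the property the abstract highlights when it says characters are transcribed in the correct order.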

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 76 articles.

1. An ML-aided Approach to Automatically Generate Schematic Symbols in PCB EDA Tools;Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD;2024-09-09

2. SARN: Script-Aware Recognition Network for scene multilingual text recognition;Expert Systems with Applications;2024-09

3. NDOrder: Exploring a novel decoding order for scene text recognition;Expert Systems with Applications;2024-09

4. Irregular text block recognition via decoupling visual, linguistic, and positional information;Pattern Recognition;2024-09

5. Enhancing Text Recognition Performance Through Multi-Dimensional Data Analysis;2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN);2024-07-03