AB-LSTM

Published:2020-01-10 Issue:4 Volume:15 Page:1-23
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Liu Zhandong¹,Zhou Wengang¹,Li Houqiang¹

Affiliation:

1. University of Science and Technology of China, Shushan District, Hefei, China

Abstract

Detecting scene text of arbitrary shape is a challenging task in computer vision. Most existing scene text detection methods use rectangular or quadrangular bounding boxes to denote the detected text, which cannot accurately fit text with arbitrary shapes, such as curved text. In addition, recent progress in scene text detection has benefited from Fully Convolutional Networks. Text cues contained in multi-level convolutional features are complementary for detecting scene text objects, but how to exploit these multi-level features is still an open problem. To tackle these issues, we propose an Attention-based Bidirectional Long Short-Term Memory (AB-LSTM) model for scene text detection. First, word stroke regions (WSRs) and text center blocks (TCBs) are extracted by two AB-LSTM models, respectively. Then, the union of WSRs and TCBs is used to represent text objects. To verify the effectiveness of the proposed method, we perform experiments on four public benchmarks (CTW1500, Total-Text, ICDAR2013, and MSRA-TD500) and compare it with existing state-of-the-art methods. Experimental results demonstrate that the proposed method achieves competitive results and handles scene text objects with arbitrary shapes well (i.e., curved, oriented, and horizontal forms).
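The final step described in the abstract, representing text objects as the union of WSRs and TCBs, can be illustrated with a minimal sketch. It assumes both outputs are available as binary masks of the same size. The function names, the toy masks, and the connected-component grouping are hypothetical choices made for illustration; the abstract does not specify how the union is computed or how pixels are grouped into text instances.

```python
from collections import deque


def union_masks(wsr, tcb):
    """Pixel-wise union of two equally sized binary masks (lists of 0/1 rows)."""
    return [[a | b for a, b in zip(row_w, row_t)] for row_w, row_t in zip(wsr, tcb)]


def connected_components(mask):
    """Group foreground pixels into 4-connected regions.

    Illustrative grouping step only; it is an assumption, not taken from the paper.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                region = []
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions


# Toy masks: isolated stroke pixels (WSR) and center-block pixels (TCB).
wsr_mask = [
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
tcb_mask = [
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]

text_mask = union_masks(wsr_mask, tcb_mask)
text_regions = connected_components(text_mask)
print(len(text_regions))
```

In this toy input the stroke pixels alone form three disconnected fragments, while the union with the center blocks merges them into two regions, which is one way the two cues can complement each other.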

Funder

Youth Innovation Promotion Association of the Chinese Academy of Sciences

National Natural Science Foundation of China

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications,Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3356728

Reference: 54 articles.

1. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).

2. Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition

3. Paying more attention to saliency: Image captioning with saliency and context attention;Cornia Marcella;ACM Trans. Multimedia Comput. Commun. Appl.;2018

Cited by 12 articles.

1. Buffer-text: Detecting arbitrary shaped text in natural scene image;Engineering Applications of Artificial Intelligence;2024-04

2. Multimodal Visual-Semantic Representations Learning for Scene Text Recognition;ACM Transactions on Multimedia Computing, Communications, and Applications;2024-03-27

3. Yi printed character recognition based on deep learning;Procedia Computer Science;2024

4. DC-PSENet: a novel scene text detection method integrating double ResNet-based and changed channels recursive feature pyramid;The Visual Computer;2023-09-27

5. Sentiment Analysis of Comment Texts on Online Courses Based on Hierarchical Attention Mechanism;Applied Sciences;2023-03-26