MANGO: A Mask Attention Guided One-Stage Scene Text Spotter


Qiao Liang, Chen Ying, Cheng Zhanzhan, Xu Yunlu, Niu Yi, Pu Shiliang, Wu Fei


Recently, end-to-end scene text spotting has become a popular research topic due to its advantages of global optimization and high maintainability in real applications. Most methods develop various region of interest (RoI) operations to concatenate the detection part and the sequence recognition part into a two-stage text spotting framework. However, in such a framework, the recognition part is highly sensitive to the detection results (e.g., the compactness of text contours). To address this problem, we propose a novel Mask AttentioN Guided One-stage text spotting framework named MANGO, in which character sequences can be directly recognized without RoI operations. Concretely, a position-aware mask attention module is developed to generate attention weights on each text instance and its characters. It allows different text instances in an image to be allocated to different feature map channels, which are further grouped as a batch of instance features. Finally, a lightweight sequence decoder is applied to generate the character sequences. It is worth noting that MANGO inherently adapts to arbitrary-shaped text spotting and can be trained end-to-end with only coarse position information (e.g., rectangular bounding boxes) and text annotations. Experimental results show that the proposed method achieves competitive and even new state-of-the-art performance on both regular and irregular text spotting benchmarks, i.e., ICDAR 2013, ICDAR 2015, Total-Text, and SCUT-CTW1500.
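
The grouping step described in the abstract, where attention weights assign each text instance and its characters to separate channels that are then batched, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `spatial_softmax` and `attend_instances`, the tensor shapes, and the choice of a spatial softmax followed by a weighted sum are all assumptions made for illustration. Here `K` stands for a hypothetical number of instance slots and `L` for a hypothetical maximum number of characters per instance.

```python
import numpy as np


def spatial_softmax(logits):
    """Normalise each attention map over its H*W positions (assumed design)."""
    flat = logits.reshape(*logits.shape[:-2], -1)
    flat = flat - flat.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(flat)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights.reshape(logits.shape)


def attend_instances(features, attention_logits):
    """Pool a shared feature map into per-instance, per-character features.

    features:         (C, H, W) feature map shared by the whole image
    attention_logits: (K, L, H, W), one map per instance slot and character
    returns:          (K, L, C), a batch of K sequences of L feature vectors
    """
    weights = spatial_softmax(attention_logits)
    # Weighted sum over spatial positions; no RoI cropping is involved.
    return np.einsum("klhw,chw->klc", weights, features)


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 6))       # C=8, H=4, W=6
logits = rng.standard_normal((3, 5, 4, 6))      # K=3 slots, L=5 characters
instance_batch = attend_instances(features, logits)
print(instance_batch.shape)  # (3, 5, 8)
```

Under these assumptions, the resulting `(K, L, C)` array is the kind of batch a lightweight sequence decoder could consume, with each of the `K` rows decoded into one character sequence.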


Association for the Advancement of Artificial Intelligence (AAAI)



Cited by 25 articles.

1. Inverse-Like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling;IEEE Transactions on Image Processing;2024

2. SPTS v2: Single-Point Scene Text Spotting;IEEE Transactions on Pattern Analysis and Machine Intelligence;2023-12

3. MS-YOLOv5: a lightweight algorithm for strawberry ripeness detection based on deep learning;Systems Science & Control Engineering;2023-11-29

4. Text Spotting of Electrical Diagram Based on Improved PP-OCRv3;Communications in Computer and Information Science;2023-11-27

5. CommuSpotter: Scene Text Spotting with Multi-Task Communication;Applied Sciences;2023-11-21






