PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

Published:2021-05-18 Issue:4 Volume:35 Page:2782-2790
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Wang Pengfei, Zhang Chengquan, Qi Fei, Liu Shanshan, Zhang Xiaoqiang, Lyu Pengyuan, Han Junyu, Liu Jingtuo, Ding Errui, Shi Guangming

Abstract

The reading of arbitrarily-shaped text has received increasing research attention. However, existing text spotters are mostly built on two-stage frameworks or character-based methods, which suffer from Non-Maximum Suppression (NMS), Region-of-Interest (RoI) operations, or character-level annotations. To address these problems, we propose a novel fully convolutional Point Gathering Network (PGNet) for reading arbitrarily-shaped text in real time. PGNet is a single-shot text spotter in which the pixel-level character classification map is learned with the proposed PG-CTC loss, avoiding the need for character-level annotations. With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS or RoI operations, which guarantees high efficiency. Additionally, a graph refinement module (GRM) that reasons about the relations between each character and its neighbors is proposed to optimize the coarse recognition and improve the end-to-end performance. Experiments show that the proposed method achieves competitive accuracy while significantly improving running speed. In particular, on Total-Text it runs at 46.7 FPS, surpassing previous spotters by a large margin.
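
The gather-then-decode idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gather_and_decode`, the `(H, W, C)` map layout, the use of class 0 as the CTC blank, and greedy (argmax) decoding are all assumptions made for illustration. The sketch picks class-score vectors from a pixel-level map at an ordered list of points for one text instance, then collapses repeats and drops blanks, as in standard greedy CTC decoding.

```python
import numpy as np

def gather_and_decode(char_map, points, alphabet):
    """Gather class vectors along a point sequence and greedily CTC-decode them.

    char_map: (H, W, C) array of per-pixel class scores; class 0 is assumed
              to be the CTC blank (an assumption of this sketch).
    points:   ordered list of (row, col) pixel coordinates for one text instance.
    alphabet: string where alphabet[k - 1] is the symbol for class k.
    """
    if not points:
        return ""
    rows, cols = zip(*points)
    # Integer-array indexing picks one C-dimensional vector per point: shape (N, C).
    gathered = char_map[list(rows), list(cols)]
    best = gathered.argmax(axis=1).tolist()

    decoded = []
    prev = 0  # start from blank so the first non-blank class is emitted
    for k in best:
        # Emit a symbol only when the class changes and is not blank.
        if k != prev and k != 0:
            decoded.append(alphabet[k - 1])
        prev = k
    return "".join(decoded)
```

Because the decoding works directly on gathered vectors, no box proposals, RoI cropping, or NMS step appears anywhere in the sketch; a blank between two identical classes is what allows a doubled letter to survive the repeat-collapsing step.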

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 47 articles.

1. Turning a CLIP Model Into a Scene Text Spotter;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-09

2. Hyper-Local Deformable Transformers for Text Spotting on Historical Maps;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

3. Generative artificial intelligence: a systematic review and applications;Multimedia Tools and Applications;2024-08-14

4. Text Spotting with a Unified Transformer Decoder;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

5. Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes;2024 IEEE International Conference on Robotics and Automation (ICRA);2024-05-13