T‐Skeleton: Accurate scene text detection via instance‐aware skeleton embedding

Published: 2024-02-21 Issue: 6 Volume: 18 Pages: 1491-1503
ISSN: 1751-9659
Container-title: IET Image Processing
Language: en
Short-container-title: IET Image Processing

Author:

Li Haiyan¹,², Hu Xingfei¹, Lu Hongtao¹

Affiliation:

1. Department of Computer Science and Engineering, MOE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China

2. Department of Computer Science and Technology, Kashi University, Kashi, China

Abstract

Existing segmentation‐based methods have made considerable progress in arbitrarily shaped text detection because they cope well with shape variation. However, detecting text instances accurately remains challenging under dense layouts, inaccurate annotations, and complex backgrounds. Many recent works focus on improving arbitrary boundary prediction, but with inaccurate annotations and complex backgrounds, boundary pixels may be misclassified, so neighbouring instances in dense layouts are merged (i.e., adhesive texts) and become hard to distinguish. Considering both local and long‐range dependencies, this paper proposes an efficient text detector, T‐Skeleton, to obtain more reliable segmentation‐based detections. In the spirit of object skeletonization, we introduce the text instance skeleton, which highlights the semantically significant structure (similar to the skeleton of a fish) and explicitly captures the long‐range dependencies of text instances. The key idea of T‐Skeleton is to calibrate the coarse text proposals by embedding text instance skeletons so that crowded texts are separated accurately and robustly. We further design a channel attention module to enlarge the performance margin between T‐Skeleton and the segmentation baseline. Experimental results on four publicly available datasets show the superiority of T‐Skeleton in handling long and curved texts.

Funder

Natural Science Foundation of Xinjiang Uygur Autonomous Region

National Natural Science Foundation of China

Publisher

Institution of Engineering and Technology (IET)

References (54 articles)

1. A survey on methods, datasets and implementations for scene text spotting

2. Text Recognition in the Wild

3. Tian, Z., Huang, W., He, T., He, P., Qiao, Y.: Detecting text in natural image with connectionist text proposal network. In: 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, pp. 56–72 (2016)

4. Shi, B., Bai, X., Belongie, S.J.: Detecting oriented text in natural images by linking segments. In: Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, pp. 3482–3490 (2017)

5. Deng, J., Liu, H.F., Li, X.L., Cai, D.: PixelLink: Detecting scene text via instance segmentation. In: AAAI Conference on Artificial Intelligence, New Orleans, Louisiana, USA, pp. 6773–6780 (2018)