RA-YOLOv8: An Improved YOLOv8 Seal Text Detection Method

Published: 2024-07-30 Issue: 15 Volume: 13 Page: 3001
ISSN: 2079-9292
Container-title: Electronics
Language: en

Author:

Sun Han¹, Tan Chaohong², Pang Si¹, Wang Hancheng², Huang Baohua¹,²

Affiliation:

1. School of Computer and Electronic Information, Guangxi University, Nanning 530004, China

2. Guangxi Key Laboratory of Digital Infrastructure, Guangxi Zhuang Autonomous Region Information Center, Nanning 530000, China

Abstract

To detect text in electronic seals affected by significant background interference, blurring, overlapping text, and curved text, an improved YOLOv8 model named RA-YOLOv8 was developed. The model is based on YOLOv8, with optimized structures in its backbone and neck. A receptive-field attention and efficient multi-scale attention (RFEMA) module is introduced in the backbone. It combines the receptive-field spatial feature attention of the receptive-field attention and coordinate attention (RFCA) module with the cross-spatial learning of the efficient multi-scale attention (EMA) module, enhancing the model’s ability to extract and integrate local and global features. The Alterable Kernel Convolution (AKConv) module is incorporated in the neck, improving detection accuracy on curved text by dynamically adjusting the sampling positions. Furthermore, to boost detection performance, the original loss function is replaced with the Minimum Point Distance Intersection over Union (MPDIoU) bounding box regression loss function. Experimental results show that RA-YOLOv8 surpasses YOLOv8 in precision, recall, and F1 value, with improvements of 0.4%, 1.6%, and 1.03%, respectively, validating its effectiveness and utility in seal text detection.
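The abstract names the MPDIoU loss but does not state its formula. As a rough illustration only, the sketch below follows the commonly cited definition of MPDIoU for axis-aligned boxes: the IoU minus the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared image diagonal. The function names, the `(x1, y1, x2, y2)` box layout, and the plain-Python form are assumptions made for this example; it is not the authors' implementation and is not taken from the YOLOv8 code base.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU of two axis-aligned boxes (hypothetical helper).

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    img_w and img_h are the width and height of the input image.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Regression loss under the assumed definition: 1 - MPDIoU."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)
```

Under this definition, a predicted box that matches the ground truth exactly gives a loss of 0, and the corner-distance terms keep penalizing misaligned boxes even when they do not overlap, where plain IoU would be flat at zero.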

Funder

Open Project Program of Guangxi Key Laboratory of Digital Infrastructure

National Natural Science Foundation of China

Publisher

MDPI AG

Link

https://www.mdpi.com/2079-9292/13/15/3001/pdf
