RA-YOLOv8: An Improved YOLOv8 Seal Text Detection Method
-
Published:2024-07-30
Issue:15
Volume:13
Page:3001
-
ISSN:2079-9292
-
Container-title:Electronics
-
language:en
-
Short-container-title:Electronics
Author:
Sun Han1, Tan Chaohong2, Pang Si1, Wang Hancheng2, Huang Baohua12ORCID
Affiliation:
1. School of Computer and Electronic Information, Guangxi University, Nanning 530004, China 2. Guangxi Key Laboratory of Digital Infrastructure, Guangxi Zhuang Autonomous Region Information Center, Nanning 530000, China
Abstract
To detect text from electronic seals that have significant background interference, blurring, text overlapping, and curving, an improved YOLOv8 model named RA-YOLOv8 was developed. The model is primarily based on YOLOv8, with optimized structures in its backbone and neck. The receptive-field attention and efficient multi-scale attention (RFEMA) module is introduced in the backbone. The model’s ability to extract and integrate local and global features is enhanced by combining the attention on the receptive-field spatial feature of the receptive-field attention and coordinate attention (RFCA) module and the cross-spatial learning of the efficient multi-scale attention (EMA) module. The Alterable Kernel Convolution (AKConv) module is incorporated in the neck, enhancing the model’s detection accuracy of curved text by dynamically adjusting the sampling position. Furthermore, to boost the model’s detection performance, the original loss function is replaced with the bounding box regression loss function of Minimum Point Distance Intersection over Union (MPDIoU). Experimental results demonstrate that RA-YOLOv8 surpasses YOLOv8 in terms of precision, recall, and F1 value, with improvements of 0.4%, 1.6%, and 1.03%, respectively, validating its effectiveness and utility in seal text detection.
Funder
Open Project Program of Guangxi Key Laboratory of Digital Infrastructure National Natural Science Foundation of China
Reference43 articles.
1. Tian, Z., Huang, W., He, T., He, P., and Qiao, Y. (2016, January 11–14). Detecting Text in Natural Image with Connectionist Text Proposal Network. Proceedings of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands. 2. Shi, B., Bai, X., and Belongie, S. (2017, January 21–26). Detecting Oriented Text in Natural Images by Linking Segments. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA. 3. Zhou, X., Yao, C., Wen, H., Wang, Y., Zhou, S., He, W., and Liang, J. (2017, January 21–26). EAST: An Efficient and Accurate Scene Text Detector. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA. 4. Long, S., Ruan, J., Zhang, W., He, X., Wu, W., and Yao, C. (2018, January 8–14). TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. Proceedings of the Computer Vision—ECCV 2018, Munich, Germany. 5. Yang, Q., Cheng, M., Zhou, W., Chen, Y., Qiu, M., and Lin, W. (2018, January 13–19). Inceptext: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection. Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden.
|
|