Authors:
Sun Weiwei, Wang Huiqian, Lu Yi, Luo Jiasai, Liu Ting, Lin Jinzhao, Pang Yu, Zhang Guo
Abstract
With the advent of smart cities, text in images can be located and recognized accurately and then used in applications such as instant translation, image retrieval, card surface information recognition, and license plate recognition, making people's lives and work more convenient. However, because text varies in orientation, angle, and shape, extracting textual features from images is challenging. We therefore propose an improved EAST detector algorithm for detecting and recognizing slanted text in images. The proposed algorithm uses reinforcement learning to train a recurrent neural network controller, selects the optimal fully convolutional neural network structure, and extracts multi-scale text features. These features are passed to the output module, where the Generalized Intersection over Union (GIoU) algorithm is used to improve the regression of the text bounding box. The loss function is then adjusted to balance the positive and negative sample classes before the improved text detection results are output. Experimental results indicate that the proposed algorithm addresses the problem of category homogenization and improves the low recall rate in target detection. Compared with other image detection algorithms, it identifies slanted text in natural scene images more reliably, and it also recognizes text well in complex environments.
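The Generalized Intersection over Union measure named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so the code below uses the standard GIoU definition and assumes axis-aligned boxes given as (x1, y1, x2, y2), which is a simplification of the slanted text boxes the paper targets. The function names `giou` and `giou_loss` are hypothetical and chosen only for illustration.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (non-zero area).
    The result lies in (-1, 1]; it is 1 for identical boxes.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to 0 when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Regression loss: 0 for a perfect match, approaching 2 for distant boxes."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, which is 0 for every pair of non-overlapping boxes, this quantity keeps decreasing as the boxes move apart, so a loss built on it still distinguishes near misses from distant ones.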
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)