Affiliation:
1. School of Information Science and Engineering, Lanzhou University, 222 S. Tianshui Rd., Lanzhou 730000, China
Abstract
Color image binarization plays a pivotal role in image preprocessing work and significantly impacts subsequent tasks, particularly for text recognition. This paper concentrates on document image binarization (DIB), which aims to separate an image into a foreground (text) and background (non-text content). We thoroughly analyze conventional and deep-learning-based approaches and conclude that prevailing DIB methods leverage deep learning technology. Furthermore, we explore the receptive fields of pre- and post-network training to underscore the Transformer model’s advantages. Subsequently, we introduce a lightweight model based on the U-Net structure and enhanced with the MobileViT module to capture global information features in document images better. Given its adeptness at learning both local and global features, our proposed model demonstrates competitive performance on two standard datasets (DIBCO2012 and DIBCO2017) and good robustness on the DIBCO2019 dataset. Notably, our proposed method presents a straightforward end-to-end model devoid of additional image preprocessing or post-processing, eschewing the use of ensemble models. Moreover, its parameter count is less than one-eighth of the model, which achieves the best results on most DIBCO datasets. Finally, two sets of ablation experiments are conducted to verify the effectiveness of the proposed binarization model.
Reference88 articles.
1. Pan, Y.F., Hou, X., and Liu, C.L. (2009, January 26–29). Text Localization in Natural Scene Images Based on Conditional Random Field. Proceedings of the 2009 10th International Conference on Document Analysis and Recognition, Barcelona, Spain.
2. OCR binarization and image pre-processing for searching historical documents;Gupta;Pattern Recognit.,2007
3. Text line extraction for historical document images;Saabni;Pattern Recognit. Lett.,2014
4. Junction detection in handwritten documents and its application to writer identification;He;Pattern Recognit.,2015
5. A survey of document image word spotting techniques;Giotis;Pattern Recognit.,2017