Affiliation:
1. Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed University), Pune, India
2. Gulbarga University, Kalaburagi, India
Abstract
This article reports an exhaustive experiment testing the performance of Segmentation-based Fractal Texture Analysis (SFTA) features with nt = 4 pairs and nt = 8 pairs, geometric features, and their combinations. A unified algorithm is designed to identify the scripts in camera-captured bilingual document images containing the international language English together with one of the Hindi, Kannada, Telugu, Malayalam, Bengali, Oriya, Punjabi, and Urdu scripts. The SFTA algorithm decomposes the input image into a set of binary images, and the fractal dimensions of the resulting regions are computed to describe the segmented texture patterns. This motivates the use of SFTA features as texture features for identifying the scripts of camera-based document images, which are affected by non-homogeneous illumination (resolution). The experiment is carried out on eleven scripts, each with 1000 sample images, at block sizes of 128 × 128, 256 × 256, 512 × 512, and 1024 × 1024. The block size 512 × 512 is observed to be optimal, giving the maximum accuracy of 86.45% for the Gujarathi and English script combination. The novelty of this article is the unified algorithm developed for script identification in bilingual document images.
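The decomposition described in the abstract (binary images, then a fractal dimension per region) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `box_count_dimension`, `find_borders`, and `sfta_like_features` are hypothetical, and the thresholds are chosen as evenly spaced quantiles, an assumption made only to keep the example dependency-free. Published SFTA implementations typically select thresholds with multi-level Otsu instead. For each binary image the sketch records the box-counting dimension of the region borders, the mean gray level, and the pixel area.

```python
import numpy as np

def box_count_dimension(mask):
    """Estimate the box-counting (fractal) dimension of a 2-D boolean mask."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0
    # Pad to a power-of-two square so the grid halves cleanly at every scale.
    size = max(2, 2 ** int(np.ceil(np.log2(max(mask.shape)))))
    padded = np.zeros((size, size), dtype=bool)
    padded[:mask.shape[0], :mask.shape[1]] = mask
    box_sizes, counts = [], []
    box = size
    while box >= 1:
        n = size // box
        # Collapse each box-by-box tile to a single "occupied" flag.
        tiles = padded.reshape(n, box, n, box).any(axis=(1, 3))
        box_sizes.append(box)
        counts.append(tiles.sum())
        box //= 2
    # The slope of log(count) against log(1 / box) is the dimension estimate.
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return float(slope)

def find_borders(mask):
    """Keep mask pixels that have at least one 8-neighbour outside the mask."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=False)
    interior = np.ones_like(mask, dtype=bool)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            interior &= padded[dy:dy + h, dx:dx + w]
    return mask & ~interior

def sfta_like_features(image, nt=4):
    """Return 3 features per binary image: border dimension, mean gray, area."""
    image = np.asarray(image, dtype=float)
    # ASSUMPTION: quantile thresholds stand in for multi-level Otsu.
    thresholds = np.quantile(image, np.linspace(0, 1, nt + 2)[1:-1])
    # nt single-threshold images plus nt - 1 images between adjacent thresholds.
    pairs = [(t, np.inf) for t in thresholds]
    pairs += list(zip(thresholds[:-1], thresholds[1:]))
    features = []
    for low, high in pairs:
        mask = (image > low) & (image <= high)
        mean_gray = float(image[mask].mean()) if mask.any() else 0.0
        features += [box_count_dimension(find_borders(mask)),
                     mean_gray, float(mask.sum())]
    return np.array(features)
```

Calling `sfta_like_features(block, nt=4)` on a grayscale block (for example a 512 × 512 crop) yields one feature vector per block in this sketch. Such vectors could then be passed to any classifier. The abstract does not say which classifier the authors use.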
Subject
Human-Computer Interaction, Information Systems