Affiliation:
1. National Institute of Technology, Jalandhar, Punjab, India
Abstract
Script identification from complex and colorful images is an integral part of text recognition and classification systems. Such images pose two kinds of challenges: (1) camera-related challenges, such as blurring, non-uniform illumination, and noisy backgrounds, and (2) text-related challenges, such as variation in text shape, orientation, and size. Existing work in this area focuses largely on non-Indian scripts, even though Gurumukhi, Hindi, and English scripts play a vital role in communication among Indians and foreigners. In this article, we address the above challenges in script identification. Additionally, we introduce a new dataset that contains Hindi, Gurumukhi, and English scripts from scenic images collected from different sources. We also propose a CNN-based model that distinguishes between these scripts with good accuracy. The performance of the method is evaluated on our own dataset, NITJDATASET, and on other benchmark datasets available for Indian scripts, namely CVSI-2015 (Task-1 and Task-4) and ILST. This work is an extension of script identification from strict text backgrounds.
Publisher
Association for Computing Machinery (ACM)
Cited by
14 articles.