Affiliation:
1. Department of Computer Science & Engineering, PG Centre, Visvesvaraya Technological University, Belgaum-590018, Karnataka, India
2. Department of Computer Science & Engineering, Basaveshwar Engineering College, Bagalkot-587102, Karnataka, India
Abstract
Reliable segmentation of text lines, words and characters is an important step in developing automated systems that understand text in low resolution display board images. This paper presents a new approach for segmenting text lines, words and characters from Kannada text in such images. The proposed method uses projection profile features and on-pixel distribution statistics to segment text lines. It also detects text lines that contain consonant modifiers, merges them with the corresponding text lines, and separates overlapped text lines. The character extraction process computes character boundaries from vertical profile features and uses them to extract character images from every text line. The word segmentation process uses k-means clustering to group inter-character gaps into character-space and word-space clusters, which are then used to compute thresholds for extracting words; this accommodates variations in character and word gaps. The methodology is evaluated on a data set of 1008 low resolution images of display boards containing Kannada text, captured with 2-megapixel mobile phone cameras at sizes of 240 × 320, 480 × 640 and 960 × 1280. The method achieves a text line segmentation accuracy of 97.17%, a word segmentation accuracy of 97.54% and a character extraction accuracy of 99.09%. It is tolerant to font variability, spacing variations between characters and words, the absence of a free segmentation path due to consonant and vowel modifiers, noise and other degradations. Experiments on images containing overlapped text lines have given promising results.
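The profile-based segmentation and gap clustering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binarized image given as a list of rows of 0/1 values (1 = on pixel), all function names are hypothetical, the k-means step is a hand-rolled one-dimensional two-cluster version, and taking the threshold as the midpoint of the two cluster centres is an assumption, since the abstract does not state how the thresholds are derived. Consonant-modifier merging and overlapped-line handling are not covered.

```python
def horizontal_profile(img):
    """Number of on pixels in each row."""
    return [sum(row) for row in img]


def vertical_profile(img):
    """Number of on pixels in each column."""
    return [sum(col) for col in zip(*img)]


def profile_runs(profile, min_count=1):
    """Half-open (start, end) ranges where the profile is >= min_count."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def two_means_threshold(gaps, iters=20):
    """Cluster 1-D gap widths into two groups; return the midpoint of the
    two centres, or None when all gaps are equal (nothing to separate)."""
    lo, hi = float(min(gaps)), float(max(gaps))
    if lo == hi:
        return None
    for _ in range(iters):
        small = [g for g in gaps if abs(g - lo) <= abs(g - hi)]
        large = [g for g in gaps if abs(g - lo) > abs(g - hi)]
        new_lo, new_hi = sum(small) / len(small), sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2  # assumed thresholding rule


def group_words(char_runs):
    """Group character column ranges into words using the gap threshold."""
    gaps = [b[0] - a[1] for a, b in zip(char_runs, char_runs[1:])]
    if not gaps:
        return [list(char_runs)]
    threshold = two_means_threshold(gaps)
    words, current = [], [char_runs[0]]
    for gap, run in zip(gaps, char_runs[1:]):
        if threshold is not None and gap > threshold:
            words.append(current)
            current = [run]
        else:
            current.append(run)
    words.append(current)
    return words


def segment(img):
    """Return, per text line, its row range and words of character ranges."""
    result = []
    for top, bottom in profile_runs(horizontal_profile(img)):
        chars = profile_runs(vertical_profile(img[top:bottom]))
        result.append(((top, bottom), group_words(chars)))
    return result
```

Calling `segment` on a binarized image returns one entry per detected text line, each holding the line's row range and a list of words, where every word is a list of character column ranges. Real display board images would need binarization and noise handling first, which this sketch leaves out.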
Publisher
World Scientific Pub Co Pte Lt
Subject
Computer Graphics and Computer-Aided Design, Computer Science Applications, Computer Vision and Pattern Recognition
Cited by
2 articles.
1. Resilient Kannada Scene Text Detection: CRAFT-YOLOv8 Fusion;2024 11th International Conference on Computing for Sustainable Global Development (INDIACom);2024-02-28
2. An efficient recognition system for preserving ancient historical documents of English characters;Journal of Ambient Intelligence and Humanized Computing;2020-06-15