A Hybrid Scene Text Script Identification Network for Regional Indian Languages

Published:2024-08-08 Issue:8 Volume:23 Page:1-26
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Naosekpam Veronica¹, Sahu Nilkanta²

Affiliation:

1. Artificial Intelligence Lab, Indian Institute of Information Technology Guwahati, Guwahati, India

2. Indian Institute of Information Technology Guwahati, Guwahati, India

Abstract

In this work, we introduce WAFFNet, an attention-centric feature fusion architecture tailored for word-level multi-lingual scene text script identification. Motivated by the limitations of traditional approaches that rely exclusively on either feature-based methods or deep learning strategies, our approach combines statistical and deep features to bridge the gap. At its core, WAFFNet fuses Local Binary Pattern, a prominent descriptor that captures low-level texture features, with high-dimensional, semantically rich convolutional features. This fusion is augmented by a spatial attention mechanism that places targeted emphasis on semantically critical regions of the input image. To address the class imbalance problem in multi-class classification scenarios, we employ a weighted objective function, which also regularizes the learning process. WAFFNet is trained end to end, leveraging transfer learning to expedite convergence and improve performance. Considering the under-representation of regional Indian languages in current datasets, we curated IIITG-STLI2023, a dataset covering English alongside six under-represented Indian languages: Hindi, Kannada, Malayalam, Telugu, Bengali, and Manipuri. Evaluation on IIITG-STLI2023, as well as on the established MLe2e and SIW-13 datasets, shows that WAFFNet outperforms both traditional feature-engineering approaches and state-of-the-art deep learning frameworks. Thus, the proposed WAFFNet framework offers a robust and effective solution for language identification in scene text images.
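
The Local Binary Pattern descriptor mentioned in the abstract can be illustrated with a minimal sketch. This is the textbook 3x3, 8-neighbour LBP written in plain Python. The abstract does not state which LBP variant, radius, or neighbour count WAFFNet uses, so the function names (`lbp_codes`, `lbp_histogram`) and every parameter choice here are assumptions made for illustration only, not the authors' implementation.

```python
def lbp_codes(image):
    """Basic 3x3 LBP codes for the interior pixels of a grayscale image.

    `image` is a list of equal-length rows of intensities. This is the
    textbook operator, assumed here for illustration; it is not taken
    from the WAFFNet paper.
    """
    # Neighbour offsets, clockwise from the top-left; neighbour i sets bit i.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(image), len(image[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre contributes a 1.
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def lbp_histogram(codes):
    """Normalised 256-bin histogram of LBP codes (a simple texture feature)."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
codes = lbp_codes(patch)
features = lbp_histogram(codes)
```

A histogram like `features` is one common way to turn LBP codes into a fixed-length texture vector that could be concatenated with convolutional features. How WAFFNet actually performs the fusion is described in the paper itself, not in this record.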

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3649439

References: 47 articles.

1. Cursive script identification using Gabor features and SVM classifier;Aarif Kilvisharam Oziuddeen Mohammed;International Journal of Computer Aided Engineering and Technology,2020

2. Identification and classification of historical Kannada handwritten document images using LBP features;Bannigidad Parashuram;International Journal of Intelligent Systems Design and Computing,2018

3. Indic handwritten script identification using offline-online multi-modal deep network

4. Application of daisy descriptor for language identification in the wild

5. Amitava Choudhury, Hukam Singh Rana, and Tanmay Bhowmik. 2018. Handwritten Bengali numeral recognition using HOG based feature extraction algorithm. In Proceedings of the 2018 5th International Conference on Signal Processing and Integrated Networks (SPIN’18). IEEE, 687–690.

Cited by 1 article.

1. Video text rediscovery: Predicting and tracking text across complex scenes;Computational Intelligence;2024-06