Advanced Applications on Bilingual Document Analysis and Processing Systems

Published:2020-10 Issue:4 Volume:11 Page:149-193
ISSN:1947-8283
Container-title:International Journal of Applied Metaheuristic Computing
language:en

Author:

Puri Shalini¹, Singh Satya Prakash²

Affiliation:

1. BIT, Mesra, Ranchi, India

2. BIT, Mesra, Ranchi, Jharkhand, India

Abstract

Today, rapid digitization requires efficient bilingual non-image and image document classification systems. Although many bilingual NLP and image-based systems provide solutions for real-world problems, they primarily focus on text extraction, identification, and recognition tasks with limited document types. This article discusses a journey of these systems and provides an overview of their methods, feature extraction techniques, document sets, classifiers, and accuracy for English-Hindi and other language pairs. The gaps found lead toward the idea of a generic and integrated bilingual English-Hindi document classification system, which classifies heterogeneous documents using a dual class feeder and two character corpora. Its non-image and image modules include pre- and post-processing stages and pre- and post-segmentation stages to classify documents into predefined classes. This article discusses many real-life applications on societal and commercial issues. The analytical results show important findings of existing and proposed systems.

Publisher

IGI Global

Subject

Decision Sciences (miscellaneous),Computational Mathematics,Computational Theory and Mathematics,Control and Optimization,Computer Science Applications,Modeling and Simulation,Statistics and Probability

References: 95 articles.

1. A Simple Study of Webpage Text Classification Algorithms for Arabic and English Languages

2. WordNet based Cross-Language Text Categorization

3. Improving statistical machine translation through co-joining parts of verbal constructs in English-Hindi translation;K. K. Arora;Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation;2012

4. Applying Query Formulation and Fusion Techniques For Cross Language News Story Search

5. A survey on optical character recognition for Bangla and Devanagari scripts

Cited by 5 articles.

1. Image Classification with Information Extraction by Evaluating the Text Patterns in Bilingual Documents;Communications in Computer and Information Science;2023

2. Smart-farming assistance for agricultural crops in various seasons using web-enabled information service;LOW RADIOACTIVITY TECHNIQUES 2022 (LRT 2022): Proceedings of the 8th International Workshop on Low Radioactivity Techniques;2023

3. A Review of Ambiguous News Detection Approaches with Deep Learning, Machine Learning, and Ensemble Paradigms;2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT);2022-10-07

4. A Novel Approach to Ambiguous Fake News Classification through Machine Learning;2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT);2022-10-07

5. Social & Juristic challenges of AI for Opinion Mining Approaches on Amazon & Flipkart Product Reviews Using Machine Learning Algorithms;SN Computer Science;2021-03-30