Weighted combination of per-frame recognition results for text recognition in a video stream

Published:2021-02 Issue:1 Volume:45 Page:77-89
ISSN:2412-6179
Container-title:Computer Optics
language:ru

Author:

Petrova O.¹, Bulatov K.², Arlazarov V.V.¹, Arlazarov V.L.²

Affiliation:

1. FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia

2. FRC CSC RAS, Moscow, Russia; Smart Engines Service LLC, Moscow, Russia; Moscow Institute of Physics and Technology (State University), Moscow, Russia

Abstract

The scope of automated document recognition has expanded, and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of particular interest. However, it is not always possible to ensure controlled capture conditions and, consequently, high-quality input images. Unlike specialized scanners, mobile cameras can use a video stream as input, yielding several images of the recognized object captured under varying conditions. This raises the problem of combining information from multiple input frames. In this paper, we propose a weighting model for combining per-frame recognition results, two approaches to the weighted combination of text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested on datasets of identity documents captured with a mobile device camera under different conditions, including perspective distortion of the document image and low lighting. The experimental results show that weighted combination can improve the quality of text recognition in a video stream, and that per-character weighting with input image focus estimation as the base criterion achieves the best results on the datasets analyzed.
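
As a rough illustration of the per-character weighting idea mentioned in the abstract, the sketch below combines per-frame text results by a weighted vote at each character position. It is not the paper's algorithm: it assumes the per-frame strings are already aligned and of equal length, and that each frame's weight (for example, a focus score) is supplied by the caller. The function name `combine_per_character` and the sample strings and weights are hypothetical.

```python
from collections import defaultdict


def combine_per_character(frame_results, frame_weights):
    """Weighted vote per character position over per-frame strings.

    Assumption: all strings are pre-aligned and have the same length;
    frame_weights[i] is a non-negative quality score for frame i.
    """
    if len(frame_results) != len(frame_weights):
        raise ValueError("one weight is required per frame result")
    if not frame_results:
        return ""
    length = len(frame_results[0])
    if any(len(text) != length for text in frame_results):
        raise ValueError("this sketch requires equal-length, aligned results")

    combined = []
    for pos in range(length):
        # Accumulate the weight of every frame voting for each character.
        votes = defaultdict(float)
        for text, weight in zip(frame_results, frame_weights):
            votes[text[pos]] += weight
        # Keep the character with the largest total weight.
        combined.append(max(votes, key=votes.get))
    return "".join(combined)


# Hypothetical data: two blurry frames misread "I" as "1"; one sharp frame is correct.
frames = ["SM1TH", "SM1TH", "SMITH"]
unweighted = combine_per_character(frames, [1.0, 1.0, 1.0])
weighted = combine_per_character(frames, [0.1, 0.1, 0.9])
```

With equal weights the two blurry frames outvote the sharp one, while weighting by a quality score lets the single sharp frame decide the disputed character.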

Funder

Russian Foundation for Basic Research

Publisher

Samara State National Research University

Subject

Electrical and Electronic Engineering; Computer Science Applications; Atomic and Molecular Physics, and Optics

Cited by 11 articles.

1. The use of deep learning integrating image recognition in language analysis technology in secondary school education;Scientific Reports;2024-02-05

2. A Novel 6G Scalable Blockchain Clustering-Based Computer Vision Character Detection for Mobile Images;Computers, Materials & Continua;2024

3. Text localization and recognition of Chinese characters in natural scenes based on improved faster R-CNN;Journal of Intelligent & Fuzzy Systems;2023-11-04

4. Data Recognition for Multi-Source Heterogeneous Experimental Detection in Cloud Edge Collaboratives;International Journal of Information Technologies and Systems Approach;2023-09-26

5. Research on Quick Compliance Audit Technology for Information Operation and Maintenance Maintenance Process Scenarios Based on Artificial Intelligence Recognition;Proceedings of the 2023 3rd Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence and Big Data Forum;2023-09-22