ON THE INFLUENCE OF WORD REPRESENTATIONS FOR HANDWRITTEN WORD SPOTTING IN HISTORICAL DOCUMENTS

Published:2012-08 Issue:05 Volume:26 Page:1263002
ISSN:0218-0014
Container-title:International Journal of Pattern Recognition and Artificial Intelligence
language:en
Short-container-title:Int. J. Patt. Recogn. Artif. Intell.

Author:

LLADÓS JOSEP¹,RUSIÑOL MARÇAL¹,FORNÉS ALICIA¹,FERNÁNDEZ DAVID¹,DUTTA ANJAN¹

Affiliation:

1. Computer Vision Center, Computer Science Department, Edifici O, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain

Abstract

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using dynamic time warping (DTW) as a baseline reference, a bag-of-visual-words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach where words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally give better performance; however, large descriptors are difficult to implement in a retrieval scenario where word spotting requires indexing millions of word images.

Publisher

World Scientific Pub Co Pte Lt

Subject

Artificial Intelligence,Computer Vision and Pattern Recognition,Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218001412630025

References: 23 articles.

1. Handwritten-Word Spotting Using Biologically Inspired Features

2. Handwritten Word Spotting in Old Manuscript Images Using a Pseudo-structural Descriptor Organized in a Hash Structure

3. A Novel Word Spotting Method Based on Recurrent Neural Networks

4. Localizing Objects with Smart Dictionaries

5. Bipartite weighted matching for on-line handwritten Chinese character recognition

Cited by 26 articles.

1. A Transformer-based Siamese Network For Word Image Retrieval In Historical Documents;2023 IEEE Smart World Congress (SWC);2023-08-28

2. A bibliometric analysis of off-line handwritten document analysis literature (1990–2020);Pattern Recognition;2022-05

3. Word Spotting as a Service: An Unsupervised and Segmentation-Free Framework for Handwritten Documents;Journal of Imaging;2021-12-17

4. Hierarchical graphs for coarse-to-fine error tolerant matching;Pattern Recognition Letters;2020-06

5. An adaptive document recognition system for lettrines;International Journal on Document Analysis and Recognition (IJDAR);2019-10-10