Ocropodium: open source OCR for small-scale historical archives

Published:2011-11-21 Issue:1 Volume:38 Page:76-86
ISSN:0165-5515
Container-title:Journal of Information Science
language:en
Short-container-title:Journal of Information Science

Author:

Blanke Tobias¹, Bryant Michael¹, Hedges Mark¹

Affiliation:

1. King’s College London, UK

Abstract

Large-scale digitization projects dealing with text-based historical material face challenges that are not well catered for by commercial software. This article discusses the results of a project to build a scalable OCR workflow for historical collections based on open source tools that is particularly tailored towards use in small-scale historical archives. It argues that open source tools allow for better customization to match these requirements, particularly with regard to character model training and per-project language modelling. We offer insights into our accuracy evaluation results of various open source OCR tools, as well as a case study about the challenges and opportunities of open source OCR in historical archives.

Publisher

SAGE Publications

Subject

Library and Information Sciences, Information Systems

Link

http://journals.sagepub.com/doi/pdf/10.1177/0165551511429418

References: 13 articles.

1. How Good Can It Get?

2. A comprehensive evaluation methodology for noisy historical document recognition techniques

3. Inheritance and loss? A brief survey of Google Books

Cited by 9 articles.

1. A document image classification system fusing deep and machine learning models;Applied Intelligence;2022-11-15

2. Risky business? Addressing the challenges of historical methods in the ‘digital age’;Collegian;2020-12

3. Optimisation of archival processes involving digitisation of typewritten documents;Aslib Journal of Information Management;2020-07-16

4. Usability evaluation of an open-source environmental monitoring data dashboard for archivists;Archival Science;2020-04-18

5. The context and state of open source software adoption in US academic libraries;Library Hi Tech;2019-11-18