Classification of incunable glyphs and out-of-distribution detection with joint energy-based models
Published: 2023-06-22
Issue: 3
Volume: 26
Pages: 223-240
ISSN: 1433-2833
Container-title: International Journal on Document Analysis and Recognition (IJDAR)
Language: en
Short-container-title: IJDAR
Authors:
Florian Kordon, Nikolaus Weichselbaumer, Randall Herz, Stephen Mossman, Edward Potten, Mathias Seuret, Martin Mayr, Vincent Christlein
Abstract
Optical character recognition (OCR) has proved a powerful tool for the digital analysis of printed historical documents. However, its ability to localize and identify individual glyphs is challenged by the tremendous variety in historical type design, the physicality of the printing process, and the state of conservation. We propose to mitigate these problems by a downstream fine-tuning step that corrects for pathological and undesirable extraction results. We implement this idea by using a joint energy-based model which classifies individual glyphs and simultaneously prunes potential out-of-distribution (OOD) samples like rubrications, initials, or ligatures. During model training, we introduce specific margins in the energy spectrum that aid this separation and explore the glyph distribution’s typical set to stabilize the optimization procedure. We observe strong classification at 0.972 AUPRC across 42 lower- and uppercase glyph types on a challenging digital reproduction of Johannes Balbus’ Catholicon, matching the performance of purely discriminative methods. At the same time, we achieve OOD detection rates of 0.989 AUPRC and 0.946 AUPRC for OOD ‘clutter’ and ‘ligatures’, which substantially improves upon recently proposed OOD detection techniques. The proposed approach can be easily integrated into the postprocessing phase of current OCR to aid reproduction and shape analysis research.
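The idea of classifying a glyph while rejecting OOD samples from the same network output can be illustrated with a minimal sketch. It assumes the common joint energy-based formulation in which the energy of an input is the negative log-sum-exp of the classifier logits, so that low energy indicates an in-distribution sample. The function names, the example logits, and the threshold value are hypothetical and chosen for illustration only; the margins and training procedure described in the abstract are not reproduced here.

```python
import math


def energy(logits):
    """Energy E(x) = -logsumexp(logits), computed in a numerically stable way.

    Assumption: the logits come from a glyph classifier; lower energy is
    read as "more in-distribution".
    """
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))


def classify_or_reject(logits, threshold):
    """Return the arg-max class index, or None if the sample looks OOD.

    `threshold` is a hypothetical cut-off that would have to be tuned on
    validation data; it is not a value taken from the paper.
    """
    if energy(logits) > threshold:
        return None  # energy too high: treat as clutter, ligature, etc.
    return max(range(len(logits)), key=lambda k: logits[k])


# Illustrative logits: one confident prediction, one nearly flat output.
confident = [9.0, 0.5, -1.0]
flat = [0.1, 0.0, -0.1]

print(classify_or_reject(confident, threshold=-5.0))
print(classify_or_reject(flat, threshold=-5.0))
```

With these illustrative inputs, the confident logits yield a low energy and are assigned a class, while the flat logits yield a higher energy and are rejected. Using a single scalar score derived from the logits keeps the rejection step cheap enough to run as a postprocessing filter after glyph extraction.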
Funder
Friedrich-Alexander-Universität Erlangen-Nürnberg
Publisher
Springer Science and Business Media LLC
Subject
Computer Science Applications,Computer Vision and Pattern Recognition,Software
References: 73 articles.
Cited by
4 articles.