Abstract
In historical recto–verso manuscripts, the text written on the reverse side of a folio often seeps through the paper fibers, so that the texts on the two sides appear mixed. This damage is severely impairing and cannot be physically removed; it hinders both the work of philologists and palaeographers and the automatic analysis of linguistic content. We propose a procedure based on neural networks (NN) to remove this interference from the complex background of the manuscripts. We adopt a very simple shallow NN whose learning phase employs a training set generated from the data itself, using a theoretical blending model that accounts for ink diffusion and saturation. Because the model is parametric, various levels of damage can be simulated in the training set, which favors the generalization capability of the NN. More explicitly, the network can be trained without a large collection of similar manuscripts, yet is still able, at least to some extent, to classify manuscripts with varying degrees of corruption. We compare the performance of this NN with that of other methods, both qualitatively and quantitatively, on a reference dataset and on heavily damaged historical manuscripts.
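As a rough illustration of how a training set could be generated from the data itself at several damage levels, the following Python sketch simulates bleed-through on a clean recto patch. It is not the blending model of the paper, which the abstract does not specify. The multiplicative mixing (a simple stand-in for saturation), the box blur (a stand-in for ink diffusion), the `strength` parameter, the choice of features and targets, and all function names are assumptions made for this example.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter of size (2*radius+1)^2 with edge padding.
    Assumed stand-in for ink diffusion through the paper."""
    if radius == 0:
        return img.astype(float)
    k = 2 * radius + 1
    padded = np.pad(img.astype(float), radius, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def simulate_bleed_through(recto, verso, strength, radius=1):
    """Hypothetical blending model (not the paper's).
    recto, verso: float arrays in [0, 1], 1 = blank paper, 0 = full ink.
    strength in [0, 1]: 0 = no seepage, 1 = verso ink fully visible."""
    mirrored = verso[:, ::-1]            # the verso is seen mirrored from the recto side
    diffused = box_blur(mirrored, radius)
    seeped = diffused ** strength        # strength 0 gives an all-ones (transparent) layer
    return recto * seeped                # the product stays in [0, 1], so ink saturates at 0

def make_training_set(recto, verso, strengths, radius=1):
    """Build per-pixel samples at several simulated damage levels.
    Features: (degraded recto pixel, mirrored verso pixel); target: clean recto pixel.
    This feature/target choice is an assumption for illustration."""
    features, targets = [], []
    mirrored = verso[:, ::-1].astype(float).ravel()
    for s in strengths:
        degraded = simulate_bleed_through(recto, verso, s, radius)
        features.append(np.stack([degraded.ravel(), mirrored], axis=1))
        targets.append(recto.astype(float).ravel())
    return np.concatenate(features), np.concatenate(targets)
```

Sweeping `strengths` over several values is what would expose a small network to mild and severe interference alike; any regressor or classifier could then be fitted to the returned arrays.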
Funder
Consiglio Nazionale Delle Ricerche
Publisher
Springer Science and Business Media LLC
Cited by
1 article.