The Value of Preexisting Structures for Digital Access: Modelling the Resolutions of the Dutch States General

Author:

Koolen Marijn (1), Hoekstra Rik (1), Oddens Joris (1), Sluijter Ronald (1), Van Koert Rutger (2), Brouwer Gijsjan (2), Brugman Hennie (2)

Affiliation:

1. Huygens Institute for the History of the Netherlands, Netherlands

2. KNAW Humanities Cluster - Department of Digital Infrastructure, Netherlands

Abstract

The Resolutions of the Dutch States General (1576–1796) is an archive covering over two centuries of decision making and consists of a heterogeneous series of handwritten and printed documents. The archive, which has recently been digitised, is a rich source for historical research. However, owing to the archive’s heterogeneity and dispersion of information, historians and other researchers find it hard to use the archive for their research. In this article, we describe how we deal with the challenges of structuring and connecting the information in this archive. We focus on identifying the existing structural elements, to turn the archive from a set of pages into a set of meeting dates and individual resolutions, with rich metadata for each resolution. To deal with the challenges of historical language change, spelling variation, and text recognition mistakes, we exploit the repetitive nature of the language of the resolutions and use fuzzy string searching to identify structural elements by the formulaic expressions that signal their boundaries. We also discuss and provide an analysis of the value of extracting different types of entities from the text and argue that the choice of which types of entities to focus on should be made based on how they support relevant research questions and methods. In the resolutions, we choose to prioritise person qualifications such as profession, legal status, or title, over person names. Qualifications allow users to select certain groups of people and to meaningfully combine with other layers of metadata, whereas person names lack contextual information to disambiguate them, making it unclear which and how many persons are referred to by selecting a specific person name. We show how our methodology results in a computational platform that allows users to explore and analyse the archive through many connected layers of metadata.
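The idea of locating formulaic expressions despite spelling variation and text recognition mistakes can be sketched as follows. This is a minimal illustration using Python's standard `difflib` module, not the authors' implementation: the function name `fuzzy_find`, the similarity threshold of 0.8, and the example phrase are all assumptions chosen for demonstration.

```python
from difflib import SequenceMatcher


def fuzzy_find(phrase, text, threshold=0.8):
    """Find windows of `text` that approximately match `phrase`.

    Returns a list of (offset, matched_text, score) tuples. Hypothetical
    helper for illustration; assumes lowercasing does not change the
    length of the text (true for plain ASCII input).
    """
    phrase_lc = phrase.lower()
    text_lc = text.lower()
    size = len(phrase_lc)

    # Score every window of the same length as the phrase.
    candidates = []
    for start in range(0, max(1, len(text_lc) - size + 1)):
        window = text_lc[start:start + size]
        score = SequenceMatcher(None, phrase_lc, window, autojunk=False).ratio()
        if score >= threshold:
            candidates.append((start, text[start:start + size], score))

    # Overlapping candidates describe the same occurrence: keep the best one.
    matches = []
    for cand in candidates:
        if matches and cand[0] < matches[-1][0] + size:
            if cand[2] > matches[-1][2]:
                matches[-1] = cand
        else:
            matches.append(cand)
    return matches


# Illustrative phrase and a line with simulated recognition errors
# ("Onfangen" for "Ontfangen", "Miffive" for "Missive").
line = "Den 3 dito. Onfangen een Miffive van den Resident"
for offset, matched, score in fuzzy_find("Ontfangen een Missive van", line):
    print(offset, repr(matched), round(score, 2))
```

A sliding window with a similarity ratio is the simplest way to tolerate a few wrong characters. A production system would more likely use an indexed or edit-distance-based fuzzy searcher for speed, and would tune the threshold per phrase.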

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design, Computer Science Applications, Information Systems, Conservation

