The Archaeotools project: faceted classification and natural language processing in an archaeological context

Author:

Jeffrey S.¹, Richards J.¹, Ciravegna F.², Waller S.¹, Chapman S.², Zhang Z.²

Affiliation:

1. Archaeology Data Service, Department of Archaeology, The King's Manor, University of York, York YO1 7EP, UK

2. Web Intelligence Technologies Laboratory, Natural Language Processing Group, Department of Computer Science, University of Sheffield, Sheffield S1 4DP, UK

Abstract

This paper describes ‘Archaeotools’, a major e-Science project in archaeology. The aim of the project is to use faceted classification and natural language processing to create an advanced infrastructure for archaeological research. The project aims to integrate over 1×10⁶ structured database records referring to archaeological sites and monuments in the UK, with information extracted from semi-structured grey literature reports, and unstructured antiquarian journal accounts, in a single faceted browser interface. The project has illuminated the variable level of vocabulary control and standardization that currently exists within national and local monument inventories. Nonetheless, it has demonstrated that the relatively well-defined ontologies and thesauri that exist in archaeology mean that a high level of success can be achieved using information extraction techniques. This has great potential for unlocking and making accessible the information held in grey literature and antiquarian accounts, and has lessons for allied disciplines.

Publisher

The Royal Society

Subject

General Physics and Astronomy, General Engineering, General Mathematics


Cited by 33 articles.

1. Text Mining Oral Histories in Historical Archaeology;International Journal of Historical Archaeology;2023-01-13

2. Information Extraction and Machine Learning for Archaeological Texts;Discourse and Argumentation in Archaeology: Conceptual and Computational Approaches;2023

3. NLP and Archaeology: A View from a Digital Archive;Discourse and Argumentation in Archaeology: Conceptual and Computational Approaches;2023

4. Can BERT Dig It? Named Entity Recognition for Information Retrieval in the Archaeology Domain;Journal on Computing and Cultural Heritage;2022-09-16

5. Same text, same discourse? Empirical validation of a discourse analysis methodology for cultural heritage;Digital Scholarship in the Humanities;2022-07-09
