Unlocking digital archives: cross-disciplinary perspectives on AI and born-digital data

Published:2022-01-12 Issue:3 Volume:37 Page:823-835
ISSN:0951-5666
Container-title:AI & SOCIETY
language:en
Short-container-title:AI & Soc

Author:

Jaillant Lise, Caputo Annalina

Abstract

Co-authored by a Computer Scientist and a Digital Humanist, this article examines the challenges faced by cultural heritage institutions in the digital age, which have led to the closure of the vast majority of born-digital archival collections. It focuses particularly on cultural organizations such as libraries, museums and archives, used by historians, literary scholars and other Humanities scholars. Most born-digital records held by cultural organizations are inaccessible due to privacy, copyright, commercial and technical issues. Even when born-digital data are publicly available (as in the case of web archives), users often need to physically travel to repositories such as the British Library or the Bibliothèque Nationale de France to consult web pages. Given enough sample data from which to learn, AI, and more specifically machine learning algorithms, offers the opportunity to improve and ease access to digital archives by learning to perform complex human tasks. These range from providing intelligent support for searching the archives to automating tedious and time-consuming tasks. In this article, we focus on sensitivity review as a practical solution to unlock digital archives that would allow archival institutions to make non-sensitive information available. This promise to make archives more accessible comes with warnings about potential pitfalls and risks: inherent errors, "black box" approaches that make the algorithm inscrutable, and risks related to bias and to fake or partial information. Our central argument is that AI can deliver on its promise to make digital archival collections more accessible, but it also creates new challenges, particularly in terms of ethics. In the conclusion, we insist on the importance of fairness, accountability and transparency in the process of making digital archives more accessible.

Funder

Dublin City University

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Human-Computer Interaction,Philosophy

Link

https://link.springer.com/content/pdf/10.1007/s00146-021-01367-x.pdf

Reference: 40 articles.

1. Alex B, Llewellyn C (2020) Library carpentry: text and data mining. Centre for Data, Culture and Society. University of Edinburgh. http://librarycarpentry.org/lc-tdm/. Accessed 3 May 2021

2. Ames S, Lewis S (2020) Disrupting the library: digital scholarship and Big Data at the National Library of Scotland. Big Data Soc 7:1–7. https://doi.org/10.1177/2053951720970576

3. Baron JR, Payne N (2017) Dark archives and E-democracy: strategies for overcoming access barriers to the public record archives of the future. Presented at the 2017 conference for E-democracy and open government (CeDEM), pp 3–11. https://doi.org/10.1109/CeDEM.2017.27

4. Bird S, Klein E, Loper E (2019) Natural language processing with python—analyzing text with the natural language toolkit, O'Reilly Media. https://www.nltk.org/book/. Accessed 3 May 2021

5. Bolukbasi T, Chang K-W, Zou J et al (2016) Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In: Proceedings of the 30th international conference on neural information processing systems. Curran Associates Inc., Red Hook, NY, USA, pp 4356–4364

Cited by 17 articles.

1. Digital curation practices on web and social media archiving in libraries and archives;Journal of Librarianship and Information Science;2024-07-26

2. Bi-Objective Negative Sampling for Sensitivity-Aware Search;Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval;2024-07-10

3. Postmortem Gone Astray—A Systematic Review and Meta-Analysis;Forensic Sciences;2024-06-05

4. Peut-on parler de l’automatisation comme cinquième paradigme archivistique?;The Canadian Journal of Information and Library Science;2024-05-18

5. Changes in Digital Collections and Their Metadata: A Longitudinal Study of UIUC Digital Library;Journal of Library Metadata;2024-04-11