Wikidata and the biodiversity knowledge graph

Author:

Roderic Page

Abstract

This talk explores the role Wikidata (Vrandečić and Krötzsch 2014) might play in the task of assembling biodiversity information into a single, richly annotated and cross-linked structure known as the biodiversity knowledge graph (Page 2016). Initially conceived as a language-independent data store of facts derived from Wikipedia, Wikidata has morphed into a global knowledge graph, complete with a user-friendly interface for data entry and a powerful implementation of the SPARQL query language. Wikidata already underpins projects such as Gene Wiki (Burgstaller-Muehlbacher et al. 2016) and Scholia (Nielsen et al. 2017). Much of the content of Wikispecies is being automatically added to Wikidata, hence many of the entities relevant to biodiversity (such as taxa, taxonomic publications, and taxonomists) are well represented in Wikidata, making it even more attractive. Much of the data relevant to biodiversity is widely scattered in different locations, requiring considerable manual effort to collect and curate. Appeals to the taxonomic community to undertake these tasks have not always met with success. For example, the Global Registry of Biodiversity Repositories (GrBio) was an attempt to create a global list of biodiversity repositories, such as natural history museums and herbaria. An appeal by Schindel et al. (2016) for the taxonomic community to curate this list largely fell on deaf ears, and at the time of writing the GrBio project is moribund. Given that many repositories are housed in institutions that are the subject of articles in Wikipedia, many of these repositories already have entries in Wikidata. Hence, rather than follow the route GrBio took of building a resource and then hoping a community will assemble around that resource, we could go to Wikidata, where there is an existing community, and build the resource there.
An impressive example of the potential for this is WikiCite, which initially had the goal of including in Wikidata every article cited in any of the Wikipedias. Taxonomic articles are highly cited in Wikipedia (Nielsen 2007), and so already fall within the remit of WikiCite. Wikidata is therefore a candidate for the “bibliography of life” (King et al. 2011), a database of all taxonomic literature. Another important role Wikidata can play is to define the boundaries of a biodiversity knowledge graph. Entities such as journals, articles, people, museums, and herbaria are often already in Wikidata, hence we can delegate managing that content to the Wikidata community (bolstered by our own contributions), and focus instead on domain-specific entities such as DNA sequences and specimens, or on domain-specific attributes of entities that are already in Wikidata. This means we can avoid the inevitable “mission creep” that bedevils any attempt to link together information from multiple disciplines. These ideas are explored using examples based on content entirely within Wikidata (including entities such as publications, authorship, and natural history collections), as well as approaches that combine Wikidata with external knowledge graphs such as Ozymandias (Page 2018).
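To make the SPARQL route mentioned above concrete, here is a minimal sketch of how a taxon might be looked up in Wikidata from Python. It is an illustration under stated assumptions, not code from the talk: the function names (build_taxon_query, parse_bindings, run_query) and the User-Agent string are hypothetical, the query relies on the Wikidata property P225 (taxon name), and the endpoint is the public Wikidata Query Service.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

# Public Wikidata Query Service endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_taxon_query(taxon_name: str, limit: int = 10) -> str:
    """Build a SPARQL query for items whose taxon name (P225) equals taxon_name.

    Hypothetical helper for illustration; it only assembles a query string.
    """
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = taxon_name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f'  ?item wdt:P225 "{escaped}" .\n'
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def parse_bindings(payload: dict) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON results document into one plain dict per row."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


def run_query(
    query: str,
    user_agent: str = "biodiversity-kg-example/0.1 (hypothetical)",
) -> List[Dict[str, str]]:
    """Send the query to the Wikidata endpoint and return the flattened rows.

    Requires network access; the User-Agent value is a placeholder assumption.
    """
    url = WIKIDATA_SPARQL_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(json.load(response))
```

Calling run_query(build_taxon_query("Panthera leo")) would send the query to the live service and return a list of rows keyed by variable name. The same pattern could be adapted to other entity types discussed here, such as publications or natural history collections, by swapping in the relevant Wikidata properties.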

Publisher

Pensoft Publishers
