Building Software for Hierarchical Events in Biodiversity Informatics

Authors

Newman Peggy, Martin David, Molina Javier

Abstract

In 2019, the Atlas of Living Australia (ALA) ran a national consultation that confirmed a long-held suspicion: while simple occurrence records provide invaluable discoverability and analysis for biodiversity data, the lack of contextual information on data collection methodology and protocols limits their usefulness for species abundance estimation and time-series analysis. The consultation recognised that the ALA has strong leadership in biodiversity standards and development, and that our 12-year history and investment in projects and engagement demonstrate a clear capacity to transition to a repository capable of capturing and aggregating the monitoring and survey data required for conservation efforts (Daly 2019). Around the same time, the wider data landscape was moving in a similar direction, both internationally through the Global Biodiversity Information Facility's (GBIF) Unified Model engagements, and nationally through the development of the Australian Biodiversity Information Standard (ABIS), an ontology for describing environmental data (Anonymous 2021). We embarked on a project to examine existing data standards and practices, extend our own occurrence model, and build software that could ingest event-based datasets and make them discoverable and interoperable. Initially we focused on well-structured surveys, both marine and terrestrial, to develop the system and user interface (UI). During the project, we restructured and modelled other exemplar datasets, collaborating with GBIF to develop event terms, vocabularies, and UI components. Seeking interoperability with existing standards, we integrated concepts from both ABIS and the Ocean Biodiversity Information System's (OBIS) ENV-DATA model (De Pooter et al. 2017) into a standardised yet flexible implementation of Event Core, navigable via a friendly UI.
The initial software release comprises an ingestion pipeline that processes events in parallel with occurrences, an index capable of handling nested data structures, and a UI. The UI guides the user to explore and filter datasets; includes visualisations for data structures, taxonomic scope, repeat location surveys, and extended measurements or facts; and links out to child occurrence records. Users can download filtered original and interpreted datasets with Digital Object Identifiers (DOIs), in compressed files that comply simultaneously with the Darwin Core Archive and Frictionless Data Package specifications. On release, we will present a range of datasets covering different event-based scenarios. The model has serendipitously provided the flexibility to encapsulate complex seed bank data. During the project, we developed a draft extension, which we used to service a new data portal for the Australian Seed Bank Partnership, a testament to the model's serviceability for novel use cases. The ALA has taken innovative steps beyond simple collection of complex data types and worked with our local biodiversity informatics community to provide a navigable interface to this data. We intend to continue working with our own data providers and the international community to realise the benefits of a more complex data model.
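To illustrate what a nested event structure with child occurrence records might look like, the following is a minimal sketch, not the ALA implementation. It assumes flat Event Core rows linked by the Darwin Core terms `eventID` and `parentEventID`, with occurrences pointing at their event through `eventID`. The sample records and the helper names `nest_events` and `count_occurrences` are hypothetical.

```python
from collections import defaultdict

# Made-up sample rows; only eventID, parentEventID, occurrenceID and
# scientificName are Darwin Core terms. The values are illustrative.
EVENTS = [
    {"eventID": "survey-1", "parentEventID": None},
    {"eventID": "site-A", "parentEventID": "survey-1"},
    {"eventID": "site-B", "parentEventID": "survey-1"},
    {"eventID": "sample-A1", "parentEventID": "site-A"},
]

OCCURRENCES = [
    {"occurrenceID": "occ-1", "eventID": "sample-A1", "scientificName": "Macropus giganteus"},
    {"occurrenceID": "occ-2", "eventID": "sample-A1", "scientificName": "Eucalyptus regnans"},
    {"occurrenceID": "occ-3", "eventID": "site-B", "scientificName": "Macropus giganteus"},
]


def nest_events(events, occurrences):
    """Turn flat event rows into a list of nested trees, one per root event."""
    # Group events by their parent; root events have parentEventID None.
    children = defaultdict(list)
    for event in events:
        children[event["parentEventID"]].append(event)

    # Group occurrences by the event they were recorded in.
    occurrences_by_event = defaultdict(list)
    for occurrence in occurrences:
        occurrences_by_event[occurrence["eventID"]].append(occurrence)

    def build(event):
        event_id = event["eventID"]
        return {
            "eventID": event_id,
            "occurrences": occurrences_by_event.get(event_id, []),
            "children": [build(child) for child in children.get(event_id, [])],
        }

    return [build(root) for root in children.get(None, [])]


def count_occurrences(node):
    """Count occurrences attached to an event and all of its descendants."""
    return len(node["occurrences"]) + sum(
        count_occurrences(child) for child in node["children"]
    )
```

Calling `nest_events(EVENTS, OCCURRENCES)` yields one tree rooted at `survey-1`, and `count_occurrences` on that root rolls child occurrence records up through the hierarchy, which is the kind of aggregation a nested index would need to support. The sketch assumes the parent links contain no cycles.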

Publisher

Pensoft Publishers

Subject

General Engineering
