Challenges for FAIR Digital Object Assessment

Authors:

Esteban González, Daniel Garijo, Oscar Corcho

Abstract

A Digital Object (DO) "is a sequence of bits, incorporating a work or portion of a work or other information in which a party has rights or interests, or in which there is value". DOs should have persistent identifiers and metadata, and be readable by both humans and machines. A FAIR Digital Object (FDO) is a DO able to interact with automated data processing systems (De Smedt et al. 2020) while following the FAIR principles (Findable, Accessible, Interoperable and Reusable) (Wilkinson et al. 2016). Although FAIR was originally targeted at data artifacts, new initiatives have emerged to adapt it to other research digital resources such as software (Katz et al. 2021) (Lamprecht et al. 2020), ontologies (Poveda-Villalón et al. 2020), virtual research environments and even DOs (Collins et al. 2018). In this paper, we describe the challenges of assessing the level of compliance of a DO with the FAIR principles (i.e., its FAIRness), assuming that a DO contains multiple resources and captures their relationships. We explore different methods to calculate an evaluation score, and we discuss the challenges and importance of providing explanations and guidelines for authors.

FAIR assessment tools

There is a growing number of tools used to assess the FAIRness of DOs. Community groups like FAIRassist.org have compiled lists of guidelines and tools for assessing the FAIRness of digital resources. These range from self-assessment tools, such as questionnaires and checklists, to semi-automated validators (Devaraju et al. 2021). Examples of automated validation tools include the F-UJI Automated FAIR Data Assessment Tool (Devaraju and Huber 2020), FAIR Evaluator and FAIR Checker for datasets or individual DOs; HowFairIs (Spaaks et al. 2021) for code repositories; and FOOPS (Garijo et al. 2021) for ontologies. When it comes to assessing FDOs, we find two main challenges:

Resource score discrepancy: different FAIR assessment tools for the same type of resource produce different scores. For example, a recent study over datasets shows differences in scores for the same resource due to how the FAIR principles are interpreted by different authors (Dumontier 2022).

Heterogeneous FDO metadata: validators include tests that explore the metadata of the digital object. However, there is no agreed metadata schema to represent FDO metadata, which complicates this operation. In addition, metadata may be specific to a certain domain (De Smedt et al. 2020). To address this challenge, we need to i) agree on a minimum common set of metadata to measure the FAIRness of DOs and ii) propose a framework to describe extensions for specialized digital objects (datasets, software, ontologies, VREs, etc.).
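To make the idea of a minimum common metadata core plus specialized extensions more concrete, the following Python sketch shows how a shared core record could be combined with a type-specific extension for software. It is purely illustrative: all class and field names are assumptions, since no agreed FDO metadata schema exists yet.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CommonDOMetadata:
    """Hypothetical minimum common fields a FAIR assessment could rely on for any DO."""
    identifier: str                  # persistent identifier, e.g. a DOI or Handle
    title: str
    description: str
    license: Optional[str] = None
    creators: list[str] = field(default_factory=list)


@dataclass
class SoftwareExtension:
    """Illustrative domain-specific extension for software DOs."""
    repository_url: Optional[str] = None
    programming_language: Optional[str] = None


# Example: a software DO described by the common core plus its extension.
core = CommonDOMetadata(
    identifier="https://doi.org/10.0000/example-do",   # placeholder identifier
    title="Example analysis tool",
    description="Toy record used to illustrate the core/extension split.",
    license="Apache-2.0",
    creators=["Jane Doe"],
)
extension = SoftwareExtension(
    repository_url="https://example.org/repo",
    programming_language="Python",
)

A validator could then run the common tests against the core record and only the specialized tests against the extension, which is one possible reading of the framework for extensions mentioned above.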
In (Wilkinson et al. 2019), the authors propose a community-driven framework to assess the FAIRness of individual digital objects. This framework is based on a collection of maturity indicators, principle compliance tests, and a module to apply those tests to digital resources. The proposed indicators may be a starting point to define which tests are needed for each type of resource (de Miranda Azevedo and Dumontier 2020).

Aggregation of FAIR metrics

Another challenge is how best to produce an assessment score for an FDO, independently of the tests that are run to assess it. For example, each of the four dimensions of FAIR (Findable, Accessible, Interoperable and Reusable) usually has a different number of associated assessment tests. If the final score is based on the total number of passed tests, then by default some dimensions may have more importance than others. Similarly, not all tests may have the same importance for some specific resources (e.g., in some cases having a license may be considered more important than having full documentation). In our work we consider an FDO as an aggregation of resources, and therefore we face the additional challenge of creating an aggregated FAIRness score for the whole FDO. We consider the following aggregation scores:

Global score: calculated with the formula in Fig. 1-1. It represents the percentage of total passed tests. It does not take into account the principle to which a test belongs.

FAIR average score: calculated with the formula in Fig. 1-2. It represents the average of the passed-test ratios for each principle, plus the ratio of passed tests used to evaluate the Research Object itself.

Both metrics are agnostic to the kind of resource analyzed, and the score they produce ranges from 0 to 100. A small computational sketch of both scores is given after the Discussion.

Discussion

An FDO has metadata records that describe it. Some records are common to all DOs, and others are specific to a DO. This makes it difficult to assess some FAIR principles, such as F2: "data are described with rich metadata". Therefore, we believe a discussion of a minimal set of FAIR metadata should be addressed by the community. In addition, a FAIR assessment score can change significantly depending on the formula used for aggregating all metrics. Therefore, it is key to explain to users the method and provenance used to produce such a score. Different communities should agree on the best scoring mechanism for their FDOs, e.g., by assigning a weight to each principle and determining the right number of tests per principle, so that principles with more tests do not automatically receive more importance. We believe that the objective of a FAIR scoring system should not be to produce a ranking, but to become a mechanism to improve the FAIRness of an FDO.
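As a complement to the aggregation scores described above, the following Python sketch implements both of them and shows how they can diverge. The exact formulas are given in Fig. 1 (not reproduced here), so this reading of them, and the use of a separate "RO" label for tests applied to the Research Object itself, are assumptions.

from collections import defaultdict


def global_score(results):
    """Percentage of all passed tests, ignoring which principle each test belongs to."""
    if not results:
        return 0.0
    return 100.0 * sum(passed for _, passed in results) / len(results)


def fair_average_score(results):
    """Average of the per-group pass ratios (F, A, I, R, plus the RO-level tests)."""
    groups = defaultdict(list)
    for principle, passed in results:
        groups[principle].append(passed)
    if not groups:
        return 0.0
    ratios = [sum(tests) / len(tests) for tests in groups.values()]
    return 100.0 * sum(ratios) / len(ratios)


# Toy example: F has four tests while the other groups have one each, so a
# failed F test weighs more in the global score than in the FAIR average score.
results = [("F", True), ("F", True), ("F", False), ("F", False),
           ("A", False), ("I", True), ("R", True), ("RO", True)]
print(global_score(results))        # 62.5
print(fair_average_score(results))  # 70.0

In this toy example the two scores differ because the global score implicitly weights each principle by its number of tests, which is exactly the imbalance discussed in the aggregation section.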

Publisher

Pensoft Publishers

Subject

General Medicine

References (13 articles)

1. Collins et al. (2018). Turning FAIR into reality: Final report and action plan from the European Commission expert group on FAIR data.

2. de Miranda Azevedo & Dumontier (2020). Considerations for the Conduction and Interpretation of FAIRness Evaluations.

3. De Smedt et al. (2020). FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units.

4. Devaraju et al. (2021). From conceptualization to implementation: FAIR assessment of research data objects. Data Science Journal.
