Affiliation:
1. Rensselaer Polytechnic Institute, USA
Abstract
We present a trustworthy mechanism for sharing, reusing, and repurposing data to address the challenge of the costly and time-consuming effort needed to bring an innovative idea from the bench (basic research) to the bedside (clinical level). Even though researchers may generate a solution on their own, other aspects of research, including peer review and dissemination of data/results, have an inherent social component. Compared with the centralized mechanisms of data-sharing (and the subsequent reuse and repurposing), many, if not all, aspects of these processes can be decentralized by using blockchain (for fully decentralized and autonomous control), coupled with provenance (to ascertain how and where the resources have been leveraged) and incentive semantics (for characterizing how researchers would be rewarded for their contributions). By capturing metadata details at each step of the workflow, data will be easier to audit, verify, and merge with related datasets. It is common in settings where data is either sensitive or valuable (or both) to have formal data use agreements or sometimes less formal rules for reuse, which we have captured in smart contracts. A key innovative aspect of this work is the departure from the traditional natural language–based data use agreements to make these agreements more computable, resulting in enhanced usability and interoperability by a broader community. We have developed the Data Sharing Ontology, a structured vocabulary to guide various incentive mechanisms and criteria used in the decentralized protocol we introduced with smart contracts. Our solution can track data reuse, provide peer reviews on accountable data reuse, and report any violations, thus providing metrics for measuring data producers’ impact on reward structures and research measures. We introduce the SCIENCE-index, designed to incentivize data-sharing in scientific research, which builds upon prior indices used in academic research, such as the h-index and the data-index. The SCIENCE-index is publicly available and automatically calculated by a smart contract based on an individual’s data sharing, reuse, and responsible stewardship activities. By incentivizing fair and honest data-related activities, the SCIENCE-index can help improve the speed, cost, and quality of scientific research. As an example application of this decentralized data-sharing framework, we demonstrate how this approach could radically improve the quality and the efficiency of scientific output in the setting of COVID-19 research data-sharing from the National COVID Cohort Collaborative (N3C).
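For context on the prior indices mentioned above, the h-index is the largest number h such that a researcher has h publications with at least h citations each. The Python sketch below computes only that prior index. It is not the SCIENCE-index, whose formula is not given in this abstract, and the function name `h_index` and the sample citation counts are illustrative assumptions.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    # Rank items from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` items all have at least `rank` citations
        else:
            break  # counts only decrease from here, so h cannot grow further
    return h


# Hypothetical citation counts, for illustration only.
sample_counts = [10, 8, 5, 4, 3]
print(h_index(sample_counts))  # prints 4
```

An index computed on-chain would presumably follow a similarly deterministic rule so that anyone can recompute and verify it, but that is an assumption rather than something the abstract specifies.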
Funder
Algorand Centres of Excellence program managed by the Algorand Foundation
Publisher
Association for Computing Machinery (ACM)
References (81 articles)
1. Alice Meadows. 2014. To Share or not to Share? That is the (Research Data) Question... Retrieved Jan 31, 2021 from https://scholarlykitchen.sspnet.org/2014/11/11/to-share-or-not-to-share-that-is-the-research-data-question
2. Charles Arthur. 2010. Businesses unwilling to share data but keen on government doing it. https://www.theguardian.com/technology/2010/jun/29/business-data-sharing-unwilling
3. Effective Choice in the Prisoner's Dilemma
4. Cryptocurrency Scams: Analysis and Perspectives
5. Colin B. Begg. 1988. Publication bias: A problem in interpreting medical data. Journal of the Royal Statistical Society: Series A (Statistics in Society).