Blockchain-enabled immutable, distributed, and highly available clinical research activity logging system for federated COVID-19 data analysis from multiple institutions
Author:
Kuo Tsung-Ting (1), Pham Anh (1), Edelson Maxim E (2), Kim Jihoon (1), Chan Jason (3), Gupta Yash (4), Ohno-Machado Lucila (1, 5, 6), Anderson David M, Balacha Chandrasekar, Bath Tyler, Baxter Sally L, Becker-Pennrich Andrea, Bell Douglas S, Bernstam Elmer V, Ngan Chau, Day Michele E, Doctor Jason N, DuVall Scott, El-Kareh Robert, Florian Renato, Follett Robert W, Geisler Benjamin P, Ghigi Alessandro, Gottlieb Assaf, Hinske Ludwig C, Hu Zhaoxian, Ir Diana, Jiang Xiaoqian, Kim Katherine K, Kim Jihoon, Knight Tara K, Koola Jejo D, Kuo Tsung-Ting, Lee Nelson, Mansmann Ulrich, Matheny Michael E, Meeker Daniella, Mou Zongyang, Neumann Larissa, Nguyen Nghia H, Nick Anderson, Ohno-Machado Lucila, Park Eunice, Paul Paulina, Pletcher Mark J, Post Kai W, Rieder Clemens, Scherer Clemens, Schilling Lisa M, Soares Andrey, SooHoo Spencer, Soysal Ekin, Steven Covington, Tep Brian, Toy Brian, Wang Baocheng, Wu Zhen R, Xu Hua, Yong Choi, Zheng Kai, Zhou Yujia, Zucker Rachel A
Affiliation:
1. UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California, USA
2. Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, USA
3. Poway High School, Poway, California, USA
4. Canyon Crest Academy, San Diego, California, USA
5. Division of Health Services Research & Development, VA San Diego Healthcare System, San Diego, California, USA
6. Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
Abstract
Objective
We aimed to develop a distributed, immutable, and highly available cross-cloud blockchain system to facilitate federated data analysis activities among multiple institutions.
Materials and Methods
We preprocessed 9166 items of COVID-19 Structured Query Language (SQL) code, summary statistics, and user activity logs from the GitHub repository of the Reliable Response Data Discovery for COVID-19 (R2D2) Consortium. The repository collected local summary statistics from participating institutions and aggregated them into the global result for a COVID-19-related clinical query that clinicians had previously posted on a website. We developed both on-chain and off-chain components to store and query these activity logs and their associated queries and results on a blockchain, providing immutability, transparency, and high availability of research communication. We measured the run-time efficiency of contract deployment and network transactions, and confirmed the accuracy of the recorded logs, comparing against a centralized baseline solution.
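To make the immutability property concrete, here is a minimal sketch of a hash-chained, append-only activity log. It is an illustration only and is not the system described here: it runs in a single process with no smart contract, no network, and no consensus, and the names `ActivityLog`, `record`, `query`, and `verify`, as well as the field names, are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class ActivityLog:
    """Append-only log in which every entry commits to its predecessor."""

    def __init__(self):
        self.entries = []

    def record(self, site, action, detail):
        """Append one activity log entry and return its hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        payload = {"site": site, "action": action, "detail": detail}
        entry = {
            "prev": prev_hash,
            "payload": payload,
            "hash": entry_hash(prev_hash, payload),
        }
        self.entries.append(entry)
        return entry["hash"]

    def query(self, site):
        """Return the payloads recorded for one (hypothetical) site name."""
        return [e["payload"] for e in self.entries if e["payload"]["site"] == site]

    def verify(self):
        """Recompute every hash; any edit to an earlier entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for e in self.entries:
            if e["prev"] != prev_hash:
                return False
            if e["hash"] != entry_hash(prev_hash, e["payload"]):
                return False
            prev_hash = e["hash"]
        return True
```

In this sketch, changing the payload of any recorded entry causes `verify()` to return `False`, which is the tamper-evidence that a blockchain provides across many nodes rather than within one process.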
Results
Smart contract deployment took 4.5 s on average. Recording an activity log on the blockchain took slightly over 2 s, versus 5–9 s for the baseline. Each query took less than 0.4 s on average on the blockchain, versus around 2.1 s for the baseline.
Discussion
The low deployment, recording, and querying times confirm the feasibility of our cross-cloud, blockchain-based federated data analysis system. We have yet to evaluate the system on a larger network with multiple nodes per cloud, to consider how to accommodate a surge in activities, and to investigate methods to lower querying time as the blockchain grows.
Conclusion
Blockchain technology can be used to support federated data analysis among multiple institutions.
Funder
National Institutes of Health UCSD Academic Senate Research Graduate Division San Diego Matching Fellowship San Diego Biomedical Informatics Education & Research National Library of Medicine
Publisher
Oxford University Press (OUP)
Subject
Health Informatics
Cited by
8 articles.