Linking The Cancer Imaging Archive and GenBank to the National Clinical Cohort Collaborative

Author:

Baghal Ahmad (1), Saltz Joel (2), Kurc Tahsin (2), Prasanna Prateek (2), Baghal Samantha (3), Hajagos Janos (2), Bremer Erich (2), Al‐Shukri Shaymaa (1), Kennedy Joshua L. (4), Rutherford Michael (1), Nolan Tracy (1), Smith Kirk (1), Chute Christopher G. (5), Prior Fred (1)

Affiliation:

1. Department of Biomedical Informatics, University of Arkansas for Medical Sciences, College of Medicine, Little Rock, Arkansas, USA

2. Stony Brook University, The State University of New York, Biomedical Informatics, Stony Brook, New York, USA

3. Department of Internal Medicine, The University of Tennessee Health Science Center, Memphis, Tennessee, USA

4. Department of Pediatrics and Internal Medicine, University of Arkansas for Medical Sciences, College of Medicine, Arkansas Children's Research Institute, Little Rock, Arkansas, USA

5. Johns Hopkins University, Baltimore, Maryland, USA

Abstract

Objective: This project demonstrates the feasibility of connecting medical imaging data and features, SARS‐CoV‐2 genome variants, with clinical data in the National Clinical Cohort Collaborative (N3C) repository to accelerate integrative research on detection, diagnosis, and treatment of COVID‐19‐related morbidities. The N3C curated a rich collection of aggregated and de‐identified electronic health records (EHR) data of over 18 million patients, including 7.5 million COVID‐positive patients, seen at hospitals across the United States. Medical imaging data and variant samples are important data modalities used in the study of COVID‐19.

Materials and Methods: Imaging data and features are hosted on the Cancer Imaging Archive (TCIA), and sequenced variant samples are analyzed and stored at the NIH GenBank. The University of Arkansas for Medical Sciences (UAMS) published the first COVID‐19 data set of 105 patients on TCIA and 37 patients on GenBank. We developed a process to link imaging and genomic variants and N3C EHR data through Privacy Preserving Record Linkage (PPRL) using de‐identified cryptographic hashes to match records associated with the same individuals without using patient identifiers.

Results: The PPRL techniques were piloted using clinical and imaging data sets provided by UAMS. Developed software components and processes executed properly, and linked data were returned and processed for visualization.

Conclusion: Linking across clinical data sources at the patient level provides opportunities to gain insights from data that may not be known otherwise. The PPRL prototype and the pilot serve as a model to link disparate and diverse data repositories to enhance clinical research.
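The hash-based matching idea behind PPRL can be illustrated with a minimal sketch. This is not the authors' implementation or the software used in the pilot; the key, the field choice (first name, last name, date of birth), the record layouts, and the function names `normalize`, `make_token`, and `link_records` are all assumptions made for illustration. It shows only the general principle: each site derives a keyed hash from normalized identifiers, and records are joined on the hash rather than on the identifiers themselves.

```python
import hashlib
import hmac

# Hypothetical shared secret, assumed to be distributed to each site by a
# trusted party; the abstract does not describe how keys are managed.
SECRET_KEY = b"example-shared-secret"


def normalize(value: str) -> str:
    """Lowercase and remove all whitespace so trivial variations still match."""
    return "".join(value.lower().split())


def make_token(first: str, last: str, dob: str, key: bytes = SECRET_KEY) -> str:
    """Derive a keyed SHA-256 hash (HMAC) from normalized identifiers."""
    message = "|".join(normalize(part) for part in (first, last, dob))
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(site_a: list, site_b: list) -> list:
    """Pair records from two sites whose tokens are identical."""
    index = {record["token"]: record for record in site_b}
    return [(a, index[a["token"]]) for a in site_a if a["token"] in index]


# Hypothetical records: each site stores only the token plus its own data.
imaging = [
    {"token": make_token("Ada", "Lovelace", "1815-12-10"), "image_id": "IMG-001"},
    {"token": make_token("Alan", "Turing", "1912-06-23"), "image_id": "IMG-002"},
]
clinical = [
    {"token": make_token(" ada ", "LOVELACE", "1815-12-10"), "ehr_id": "EHR-42"},
]

linked = link_records(imaging, clinical)
for image_record, ehr_record in linked:
    print(image_record["image_id"], "<->", ehr_record["ehr_id"])
```

In this sketch the identifiers never leave the site that holds them; only the tokens are compared. Exact-match hashing of this kind is sensitive to typos and data entry differences, so a production linkage process would likely need more robust normalization or multiple token definitions.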

Publisher

Wiley

