A System for Phenotype Harmonization in the National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine (TOPMed) Program

Authors

Stilp Adrienne M, Emery Leslie S, Broome Jai G, Buth Erin J, Khan Alyna T, Laurie Cecelia A, Wang Fei Fei, Wong Quenna, Chen Dongquan, D’Augustine Catherine M, Heard-Costa Nancy L, Hohensee Chancellor R, Johnson William Craig, Juarez Lucia D, Liu Jingmin, Mutalik Karen M, Raffield Laura M, Wiggins Kerri L, de Vries Paul S, Kelly Tanika N, Kooperberg Charles, Natarajan Pradeep, Peloso Gina M, Peyser Patricia A, Reiner Alex P, Arnett Donna K, Aslibekyan Stella, Barnes Kathleen C, Bielak Lawrence F, Bis Joshua C, Cade Brian E, Chen Ming-Huei, Correa Adolfo, Cupples L Adrienne, de Andrade Mariza, Ellinor Patrick T, Fornage Myriam, Franceschini Nora, Gan Weiniu, Ganesh Santhi K, Graffelman Jan, Grove Megan L, Guo Xiuqing, Hawley Nicola L, Hsu Wan-Ling, Jackson Rebecca D, Jaquish Cashell E, Johnson Andrew D, Kardia Sharon L R, Kelly Shannon, Lee Jiwon, Mathias Rasika A, McGarvey Stephen T, Mitchell Braxton D, Montasser May E, Morrison Alanna C, North Kari E, Nouraie Seyed Mehdi, Oelsner Elizabeth C, Pankratz Nathan, Rich Stephen S, Rotter Jerome I, Smith Jennifer A, Taylor Kent D, Vasan Ramachandran S, Weeks Daniel E, Weiss Scott T, Wilson Carla G, Yanek Lisa R, Psaty Bruce M, Heckbert Susan R, Laurie Cathy C

Abstract

Genotype-phenotype association studies often combine phenotype data from multiple studies to increase statistical power. Harmonization of the data usually requires substantial effort due to heterogeneity in phenotype definitions, study design, data collection procedures, and data-set organization. Here we describe a centralized system for phenotype harmonization that includes input from phenotype domain and study experts, quality control, documentation, reproducible results, and data-sharing mechanisms. This system was developed for the National Heart, Lung, and Blood Institute’s Trans-Omics for Precision Medicine (TOPMed) program, which is generating genomic and other -omics data for more than 80 studies with extensive phenotype data. To date, 63 phenotypes have been harmonized across thousands of participants (recruited in 1948–2012) from up to 17 studies per phenotype. Here we discuss challenges in this undertaking and how they were addressed. The harmonized phenotype data and associated documentation have been submitted to National Institutes of Health data repositories for controlled access by the scientific community. We also provide materials to facilitate future harmonization efforts by the community, which include 1) the software code used to generate the 63 harmonized phenotypes, enabling others to reproduce, modify, or extend these harmonizations to additional studies, and 2) the results of labeling thousands of phenotype variables with controlled vocabulary terms.

Publisher

Oxford University Press (OUP)

Subject

Epidemiology

