ATLAS: an automated association test using probabilistically linked health records with application to genetic studies

Author:

Zhang Harrison G (1, 2, 3), Hejblum Boris P (4, 5), Weber Griffin M (1), Palmer Nathan P (1), Churchill Susanne E (1), Szolovits Peter (6), Murphy Shawn N (7, 8), Liao Katherine P (1, 2), Kohane Isaac S (1), Cai Tianxi (1, 4)

Affiliation:

1. Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA

2. Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital, Boston, Massachusetts, USA

3. Department of Biological Sciences, Columbia University, New York City, New York, USA

4. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA

5. Bordeaux Population Health, Université de Bordeaux, Inserm U1219, Inria SISTM, Bordeaux, France

6. Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

7. Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA

8. Research IS and Computing, Mass General Brigham HealthCare, Charlestown, Massachusetts, USA

Abstract

Objective: Large amounts of health data are becoming available for biomedical research. Synthesizing information across databases may capture more comprehensive pictures of patient health and enable novel research studies. When no gold standard mappings between patient records are available, researchers may probabilistically link records from separate databases and analyze the linked data. However, previous linked data inference methods are constrained to certain linkage settings and exhibit low power. Here, we present ATLAS, an automated, flexible, and robust association testing algorithm for probabilistically linked data.

Materials and Methods: Missing variables are imputed at various thresholds using a weighted average method that propagates uncertainty from probabilistic linkage. Next, estimated effect sizes are obtained using a generalized linear model. ATLAS then conducts the threshold combination test by optimally combining P values obtained from data imputed at varying thresholds using Fisher’s method and perturbation resampling.

Results: In simulations, ATLAS controls for type I error and exhibits high power compared to previous methods. In a real-world genetic association study, meta-analysis of ATLAS-enabled analyses on a linked cohort with analyses using an existing cohort yielded additional significant associations between rheumatoid arthritis genetic risk score and laboratory biomarkers.

Discussion: Weighted average imputation weathers false matches and increases contribution of true matches to mitigate linkage error-induced bias. The threshold combination test avoids arbitrarily choosing a threshold to rule a match, thus automating linked data-enabled analyses and preserving power.

Conclusion: ATLAS promises to enable novel and powerful research studies using linked data to capitalize on all available data sources.
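The pipeline described under Materials and Methods (thresholded weighted-average imputation, a per-threshold association test, and a Fisher-style combination of the resulting P values) can be illustrated with a minimal sketch. This is not the authors' implementation. All function names and parameters below are hypothetical, the per-threshold test is a simple normal-approximation slope test standing in for a generalized linear model, and a permutation null is swapped in for the perturbation resampling that the abstract mentions. The resampling step matters because P values computed at different thresholds on the same data are dependent, so the usual chi-square reference distribution for Fisher's statistic does not apply directly.

```python
import numpy as np
from math import erfc, sqrt


def impute_weighted_average(match_probs, candidate_values, threshold):
    """Impute one value per row as the probability-weighted average of the
    candidates whose match probability is at least `threshold`.
    Rows with no qualifying candidate are returned as NaN."""
    probs = np.where(match_probs >= threshold, match_probs, 0.0)
    totals = probs.sum(axis=1)
    imputed = np.full(match_probs.shape[0], np.nan)
    ok = totals > 0
    imputed[ok] = (probs[ok] @ candidate_values) / totals[ok]
    return imputed


def two_sided_p(x, y):
    """Normal-approximation P value for the slope of y on x
    (an assumed stand-in for a Gaussian GLM Wald test)."""
    keep = ~np.isnan(y)
    x, y = x[keep], y[keep]
    n = x.size
    if n < 4 or np.std(x) == 0 or np.std(y) == 0:
        return 1.0
    r = float(np.clip(np.corrcoef(x, y)[0, 1], -0.999999, 0.999999))
    t = r * sqrt(n - 2) / sqrt(1 - r * r)
    return max(erfc(abs(t) / sqrt(2)), 1e-300)  # floor avoids log(0)


def fisher_statistic(p_values):
    """Fisher's combination statistic: -2 * sum(log p)."""
    return float(-2.0 * np.sum(np.log(p_values)))


def threshold_combination_test(x, match_probs, candidate_values,
                               thresholds, n_perm=500, seed=0):
    """Combine per-threshold P values with Fisher's statistic and calibrate
    it against a permutation null (assumption: permutation replaces the
    perturbation resampling described in the abstract)."""
    rng = np.random.default_rng(seed)
    imputed = [impute_weighted_average(match_probs, candidate_values, t)
               for t in thresholds]
    observed = fisher_statistic([two_sided_p(x, y) for y in imputed])
    null = np.empty(n_perm)
    for b in range(n_perm):
        xp = rng.permutation(x)  # breaks any x-y association
        null[b] = fisher_statistic([two_sided_p(xp, y) for y in imputed])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```

Here `match_probs` is assumed to be an n-by-m matrix of linkage probabilities between n records in one database and m candidate records in the other, and `candidate_values` holds the variable to be imputed for each of the m candidates. Because the same permuted covariate is reused across all thresholds within a resample, the dependence between the per-threshold P values is preserved in the null distribution.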

Funder

US National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Health Informatics

