High Quality Phasing Using Linked-Read Whole Genome Sequencing of Patient Cohorts Informs Genetic Understanding of Complex Traits

Author:

Mastromatteo ScottORCID,Chen Angela,Gong Jiafen,Lin Fan,Thiruvahindrapuram Bhooma,Sung Wilson WL,Whitney Joe,Wang Zhuozhi,Patel Rohan V,Keenan Katherine,Halevy Anat,Panjwani Naim,Avolio Julie,Wang Cheng,Côté-Maurais Guillaume,Bégin Stéphanie,Adam Damien,Brochiero Emmanuelle,Bjornson Candice,Chilvers MarkORCID,Price April,Parkins Michael,van Wylick Richard,Mateos-Corral Dimas,Hughes Daniel,Smith Mary JaneORCID,Morrison Nancy,Tullis Elizabeth,Stephenson Anne L,Wilcox Pearce,Quon Bradley S,Leung Winnie M,Solomon Melinda,Sun Lei,Ratjen Felix,Strug Lisa JORCID

Abstract

AbstractPhasing of heterozygous alleles is critical for interpretation of cis-effects of disease-relevant variation. For population studies, phase is often inferred from external data but read-based phasing approaches that span long genomic distances would be more accurate because they enable both genotype and phase to be obtained from a single dataset. To demonstrate how read-based phasing can provide functional insights, we sequenced 477 individuals with Cystic Fibrosis (CF) using linked-read sequencing. We benchmark read-based phasing with different short- and long-read sequencing technologies, prioritize linked-read technology as the most informative and produce a benchmark phase call set from reference sample HG002 for the community. The 477 samples display an average phase block N50 of 4.39 Mb. We use these samples to construct a graph representation of CFTR haplotypes, which facilitates understanding of complex CF alleles. Fine-mapping and phasing of the chr7q35 trypsinogen locus associated with CF meconium ileus demonstrates a 20 kb deletion and a PRSS2 missense variant p.Thr8Ile (rs62473563) independently contribute to meconium ileus risk (p=0.0028, p=0.011, respectively) and are PRSS2 pancreas eQTLs (p=9.5e-7 and p=1.4e-4, respectively), explaining the mechanism by which these polymorphisms contribute to CF. Phase enables access to haplotypes that can be used for genome graph or reference panel construction, identification of cis-effects, and for understanding disease associated loci. The phase information from linked-reads provides a causal explanation for variation at a CF-relevant locus which also has implications for the genetic basis of non-CF pancreatitis to which this locus has been reported to contribute.

Publisher

Cold Spring Harbor Laboratory

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3