Long Range PCR-based deep sequencing for haplotype determination in mixed HCMV infections

Author:

Brait Nadja, Külekçi Büşra, Goerzer Irene

Abstract

Short read sequencing, which has been used extensively to decipher the genome diversity of human cytomegalovirus (HCMV) strains, often falls short in assessing the co-linearity of non-adjacent polymorphic sites in mixed HCMV populations. In the present study, we established a long amplicon sequencing workflow to identify the number and relative quantities of unique HCMV haplotypes in mixtures. Accordingly, long read PacBio sequencing was applied to amplicons spanning multiple polymorphic sites. This approach was initially validated with defined HCMV DNA templates derived from cell-free viruses and then tested for its suitability on patient samples carrying mixed HCMV infections.

Our data show that artificial HCMV DNA mixtures were correctly determined by long amplicon sequencing down to 1% abundance of the minor DNA source. Total error rate of mapped reads ranged from 0.17 to 0.43 depending on the stringency of quality trimming. PCR products of up to 7.7 kb with a GC content <55% were efficiently generated when DNA was directly isolated from bronchoalveolar lavage samples, yet long range PCR may display a slightly lower sensitivity compared to short amplicons. In a single sample, up to three distinct haplotypes were identified, showing varying relative frequencies. Intra-patient haplotype diversity is unevenly distributed across the target site and often interspersed with long identical stretches, which therefore cannot be linked by short reads. Moreover, diversity at single polymorphic regions as assessed by short amplicon sequencing may markedly underestimate the overall diversity of mixed populations.

Quantitative haplotype determination by long amplicon sequencing provides a novel approach for HCMV strain characterisation in samples with mixed infections, which can be scaled up to cover the majority of the genome. This will substantially improve our understanding of intra-host HCMV strain diversity and its dynamic behaviour.

Impact statement

Human cytomegalovirus (HCMV), a large enveloped DNA virus, displays the highest inter-host genome variability among all human herpesviruses. Primary infection, reinfection and reactivation are mostly asymptomatic but may cause devastating harm in congenitally infected newborns and in immunosuppressed individuals. Multiple distinct strains circulate in humans, each characterised by a unique assembly of well-defined polymorphic genes, most of which are linked to cell entry, persistence and immune evasion. Mixed HCMV strain infections are common and may pose a high pathogenic potential for patients at risk of symptomatic infections. To better understand the biological behaviour and dynamics of individual viral genomes, it is essential to assess the co-linearity of polymorphic sites in a genetically heterogeneous population. In this study, we established a long read sequencing technique, successfully applied it to long amplicons, and identified co-linear genome stretches (haplotypes) in patient samples with mixed HCMV populations. This strategy for haplotype determination allows linkage analysis of multiple non-adjacent polymorphic sites along up to 7.7 kb. It gives a better approximation of the true strain diversity in mixed samples, which short read sequencing approaches failed to provide. This improves our knowledge of mixed HCMV infections, which is important for clinical outcome, diagnostics, treatment and vaccine development.

Data Summary

Sequence data generated in this study were deposited in GenBank with the accession numbers MW560357-MW560373. Raw data of Illumina and PacBio sequencing were submitted to the NCBI Sequence Read Archive (SRA) under project number SUB8972240. BioSample accession numbers are provided in Supplementary Tables 3 and 4.

Additional sequence data for reference purposes were accessed from GenBank. Accession numbers are listed in Supplementary Tables 6 and 7.

Publisher

Cold Spring Harbor Laboratory
