Small polymorphisms are a source of ancestral bias in structural variant breakpoint placement

Published:2024-01 Issue:1 Volume:34 Page:7-19
ISSN:1088-9051
Container-title:Genome Research
language:en
Short-container-title:Genome Res.

Author:

Audano Peter A., Beck Christine R.

Abstract

High-quality genome assemblies and sophisticated algorithms have increased sensitivity for a wide range of variant types, and breakpoint accuracy for structural variants (SVs, ≥50 bp) has improved to near base pair precision. Despite these advances, many SV breakpoint locations are subject to systematic bias affecting variant representation. To understand why SV breakpoints are inconsistent across samples, we reanalyzed 64 phased haplotypes constructed from long-read assemblies released by the Human Genome Structural Variation Consortium (HGSVC). We identify 882 SV insertions and 180 SV deletions with variable breakpoints not anchored in tandem repeats (TRs) or segmental duplications (SDs). SVs called from aligned sequencing reads increase breakpoint disagreements by 2×–16×. Sequence accuracy had a minimal impact on breakpoints, but we observe a strong effect of ancestry. We confirm that SNP and indel polymorphisms are enriched at shifted breakpoints and are also absent from variant callsets. Breakpoint homology increases the likelihood of imprecise SV calls and the distance they are shifted, and tandem duplications are the most heavily affected SVs. Because graph genome methods normalize SV calls across samples, we investigated graphs generated by two different methods and find the resulting breakpoints are subject to other technical biases affecting breakpoint accuracy. The breakpoint inconsistencies we characterize affect ∼5% of the SVs called in a human genome and can impact variant interpretation and annotation. These limitations underscore a need for algorithm development to improve SV databases, mitigate the impact of ancestry on breakpoints, and increase the value of callsets for investigating breakpoint features.
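The link between breakpoint homology and breakpoint shifting can be illustrated with a toy example. The sketch below is not the authors' method; the sequence `REF`, the coordinates, and the helper names are all hypothetical. It shows the general principle only: when the sequence at the start of a deletion matches the sequence just past its end, several breakpoint placements yield the same haplotype, and a single SNP inside that homology narrows the set of equivalent placements.

```python
# Toy illustration (hypothetical sequence and helper names, not from the paper).

def apply_deletion(seq: str, start: int, length: int) -> str:
    """Return the haplotype produced by deleting seq[start:start + length]."""
    return seq[:start] + seq[start + length:]


def equivalent_deletion_starts(seq: str, start: int, length: int) -> list[int]:
    """List every start position whose deletion yields the same haplotype."""
    target = apply_deletion(seq, start, length)
    return [
        pos
        for pos in range(len(seq) - length + 1)
        if apply_deletion(seq, pos, length) == target
    ]


def with_snp(seq: str, pos: int, base: str) -> str:
    """Return seq with the base at pos substituted (a single SNP)."""
    return seq[:pos] + base + seq[pos + 1:]


# "ACGT" at positions 2-5 is repeated at positions 10-13, so an
# 8 bp deletion starting at position 2 has 4 bp of breakpoint homology.
REF = "TC" + "ACGT" + "CATG" + "ACGT" + "AA"

if __name__ == "__main__":
    # With intact homology the deletion can sit at any of five positions.
    print(equivalent_deletion_starts(REF, 2, 8))  # [2, 3, 4, 5, 6]

    # A G>A SNP at position 12 interrupts the homology, so fewer
    # placements remain equivalent on the haplotype carrying it.
    print(equivalent_deletion_starts(with_snp(REF, 12, "A"), 2, 8))  # [2, 3, 4]
```

In this toy model, a caller that always reports the rightmost equivalent placement would report position 6 on one haplotype and position 4 on the other for what is biologically the same deletion.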

Funder

National Institutes of Health (NIH) National Institute of General Medical Sciences

NIH National Cancer Institute

NIH National Human Genome Research Institute

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. Analysis and benchmarking of small and large genomic variants across tandem repeats;Nature Biotechnology;2024-04-26