Benchmarking splice variant prediction algorithms using massively parallel splicing assays

Published: 2023-05-07

Author:

Smith Cathy, Kitzman Jacob O.

Abstract

Background: Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because these predictors are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes.

Results: We benchmarked eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compared experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms’ concordance with MPSA measurements, and with each other, was lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieved the best overall performance at distinguishing disruptive and neutral variants. Controlling for overall call rate genome-wide, SpliceAI and Pangolin also showed superior overall sensitivity for identifying SDVs. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation. We suggest strategies for optimal splice effect prediction in the face of these issues.

Conclusion: SpliceAI and Pangolin showed the best overall performance among the predictors tested; however, improvements in splice effect prediction are still needed, especially within exons.
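As a rough illustration of the score-cutoff consideration mentioned in the abstract, the sketch below picks the cutoff that maximizes Youden's J (sensitivity + specificity - 1) on a set of labelled variants. This is one common criterion chosen here for illustration, not necessarily the strategy the authors propose; the function name `best_cutoff` and the toy scores and labels are invented for the example and are not data from the study.

```python
from typing import Sequence, Tuple


def best_cutoff(scores: Sequence[float], is_sdv: Sequence[bool]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A variant is called splice-disruptive when its score >= cutoff.
    Ties are resolved in favour of the lowest cutoff.
    """
    if len(scores) != len(is_sdv):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for y in is_sdv if y)
    n_neg = len(is_sdv) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one disruptive and one neutral variant")

    best = (float("inf"), -1.0)
    # Every distinct observed score is a candidate cutoff.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, is_sdv) if y and s >= cutoff)
        fp = sum(1 for s, y in zip(scores, is_sdv) if not y and s >= cutoff)
        j = tp / n_pos - fp / n_neg  # sensitivity - false positive rate
        if j > best[1]:
            best = (cutoff, j)
    return best


# Hypothetical predictor scores and assay-derived labels (True = disruptive).
toy_scores = [0.02, 0.10, 0.15, 0.35, 0.40, 0.80, 0.95]
toy_labels = [False, False, True, False, True, True, True]
cutoff, j = best_cutoff(toy_scores, toy_labels)
print(f"cutoff={cutoff:.2f}, J={j:.2f}")
```

A cutoff tuned this way depends on the labelled set it was fitted to, so in practice it would need to be checked on variants held out from the tuning step.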

Publisher

Cold Spring Harbor Laboratory

Cited by 2 articles.

1. Comprehensive splicing analysis of the alternatively spliced CHEK2 exons 8 and 10 reveals three enhancer/silencer‐rich regions and 38 spliceogenic variants; The Journal of Pathology; 2024-02-09

2. Computational prediction of human deep intronic variation; GigaScience; 2022-12-28