Understanding the heterogeneous performance of variant effect predictors across human protein-coding genes

Published: 2024-06-14

Author:

Fawzy Mohamed, Marsh Joseph A.

Abstract

Variant effect predictors (VEPs) are computational tools developed to assess the impacts of genetic mutations, often in terms of likely pathogenicity, employing diverse algorithms and training data. Here, we investigate the performance of 35 VEPs in the discrimination between pathogenic and putatively benign missense variants across 963 human protein-coding genes, revealing considerable gene-level heterogeneity as measured by the widely used area under the receiver operating characteristic curve (AUROC) metric. To investigate the origins of this heterogeneity and the extent to which gene-level VEP performance is predictable, we train random forest models to predict the gene-level AUROC for each VEP. We find that performance as measured by AUROC is related to factors such as gene function, protein structure, and evolutionary conservation. Notably, intrinsic disorder in proteins emerged as a significant factor influencing apparent VEP performance, often leading to inflated AUROC values due to their enrichment in weakly conserved putatively benign variants. While our results suggest that gene-level features may be useful for identifying genes where VEP predictions are likely to be more or less reliable, they also highlight the limitations of AUROC for comparing VEP performance across different genes.

Publisher

Cold Spring Harbor Laboratory

References (52 articles)

1. Next-Generation Sequencing Platforms

2. Next-generation DNA sequencing

3. Overview of Next‐Generation Sequencing Technologies

4. Settling the score: variant prioritization and Mendelian disease

5. Variation Interpretation Predictors: Principles, Types, Performance, and Choice