Genome-wide prediction of pathogenic gain- and loss-of-function variants from ensemble learning of a diverse feature set

Published: 2022-06-10

Author:

Stein David, Sevim Bayrak Çiğdem, Wu Yiming, Kars Meltem Ece, Stenson Peter D., Cooper David N., Schlessinger Avner, Itan Yuval

Abstract

Gain-of-function (GOF) variants give rise to increased or novel protein functions whereas loss-of-function (LOF) variants lead to diminished protein function. GOF and LOF variants can result in markedly varying phenotypes, even when occurring in the same gene. However, experimental approaches for identifying GOF and LOF are generally slow and costly, whilst currently available computational methods have not been optimized to discriminate between GOF and LOF variants. We have developed LoGoFunc, an ensemble machine learning method for predicting pathogenic GOF, pathogenic LOF, and neutral genetic variants. LoGoFunc was trained on a broad range of gene-, protein-, and variant-level features describing diverse biological characteristics, as well as network features summarizing the protein-protein interactome and structural features calculated from AlphaFold2 protein models. We analyzed GOF, LOF, and neutral variants in terms of local protein structure and function, splicing disruption, and phenotypic associations, thereby revealing previously unreported relationships between various biological phenomena and variant functional outcomes. For example, GOF and LOF variants exhibit contrasting enrichments in protein structural and functional regions, whilst LOF variants are more likely to disrupt canonical splicing as indicated by splicing-related features employed by the model. Further, by performing phenome-wide association studies (PheWAS), we identified strong associations between relevant phenotypes and high-confidence predicted GOF and LOF variants. LoGoFunc outperforms other tools trained solely to predict pathogenicity or general variant impact for the identification of pathogenic GOF and LOF variants.

Publisher

Cold Spring Harbor Laboratory

References: 74 articles.

1. Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes

2. Inborn errors of human STAT1: allelic heterogeneity governs the diversity of immunological and infectious phenotypes

3. Insights into protein structure, stability and function from saturation mutagenesis

4. A general framework for estimating the relative pathogenicity of human genetic variants

5. A method and server for predicting damaging missense mutations

Cited by 5 articles.

1. A power-based sliding window approach to evaluate the clinical impact of rare genetic variants in the nucleotide sequence or the spatial position of the folded protein;Human Genetics and Genomics Advances;2024-07

2. Development of a human genetics-guided priority score for 19,365 genes and 399 drug indications;Nature Genetics;2024-01

3. Leveraging large-scale multi-omics to identify therapeutic targets from genome-wide association studies;2023-11-01

4. Targeting SLC transporters: small molecules as modulators and therapeutic opportunities;Trends in Biochemical Sciences;2023-09

5. A power-based sliding window approach to evaluate the clinical impact of rare genetic variants;2022-07-31