A probabilistic graphical model for estimating selection coefficient of missense variants from human population sequence data

Published: 2023-12-13

Author:

Zhao Yige, Zhong Guojie, Hagen Jake, Pan Hongbing, Chung Wendy K., Shen Yufeng

Abstract

Accurately predicting the effect of missense variants is a central problem in the interpretation of genomic variation. Commonly used computational methods do not capture the quantitative impact on fitness in populations. We developed MisFit to estimate missense fitness effect using biobank-scale human population genome data. MisFit jointly models the effect at the molecular level (d) and the population level (selection coefficient, s), assuming that in the same gene, missense variants with similar d have similar s. MisFit is a probabilistic graphical model that integrates deep neural network components and population genetics models efficiently with inductive bias based on biological causality of variant effect. We trained it by maximizing the probability of observed allele counts in 236,017 European individuals. We show that s is informative in predicting frequency across ancestries and consistent with the fraction of de novo mutations given s. Finally, MisFit outperforms previous methods in prioritizing missense variants in individuals with neurodevelopmental disorders.
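
To make the idea of "maximizing the probability of observed allele counts" concrete, here is a minimal sketch for a single variant. It is not the MisFit model: it assumes a textbook mutation-selection balance approximation (expected allele frequency roughly mu / s) with a Poisson count model, and it omits the neural network and graphical-model components entirely. The mutation rate `mu`, the grid bounds, and the function names are hypothetical choices for illustration. Only the cohort size (236,017 individuals, so 472,034 chromosomes if diploid autosomal sites are assumed) comes from the abstract.

```python
import math


def log_likelihood(s, allele_count, n_chromosomes, mu):
    """Poisson log-probability of an observed allele count given s.

    Assumption (not from the abstract): at mutation-selection balance the
    expected allele frequency is roughly mu / s, so the expected count in
    the sample is n_chromosomes * mu / s.
    """
    lam = n_chromosomes * mu / s
    return allele_count * math.log(lam) - lam - math.lgamma(allele_count + 1)


def fit_s(allele_count, n_chromosomes, mu, grid_size=2000):
    """Grid search over s in [1e-5, 1] (log-spaced) for the maximum likelihood."""
    best_s, best_ll = None, -math.inf
    for i in range(grid_size):
        s = 10 ** (-5 + 5 * i / (grid_size - 1))
        ll = log_likelihood(s, allele_count, n_chromosomes, mu)
        if ll > best_ll:
            best_s, best_ll = s, ll
    return best_s


# 236,017 individuals -> 472,034 chromosomes (assuming diploid autosomal sites).
N_CHROMOSOMES = 2 * 236_017
MU = 1e-8  # hypothetical per-site mutation rate, for illustration only

s_singleton = fit_s(1, N_CHROMOSOMES, MU)   # variant seen once
s_common = fit_s(10, N_CHROMOSOMES, MU)     # variant seen ten times
s_absent = fit_s(0, N_CHROMOSOMES, MU)      # variant never observed
```

Under these assumptions the rarer a variant is in the sample, the larger the estimated s, and an unobserved variant is pushed to the upper bound of the grid. A per-variant estimate like this is very noisy, which is one reason to share information across variants, as the abstract describes for variants with similar d in the same gene.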

Publisher

Cold Spring Harbor Laboratory
