Interpretable pairwise distillations for generative protein sequence models

Published:2022-06-23 Issue:6 Volume:18 Page:e1010219
ISSN:1553-7358
Container-title:PLOS Computational Biology
language:en
Short-container-title:PLoS Comput Biol

Author:

Feinauer Christoph, Meynard-Piganeau Barthelemy, Lucibello Carlo

Abstract

Many different types of generative models for protein sequences have been proposed in the literature. Their uses include the prediction of mutational effects, protein design, and the prediction of structural properties. Neural network (NN) architectures have shown strong performance, commonly attributed to their capacity to extract non-trivial higher-order interactions from the data. In this work, we analyze two different NN models and assess how close they are to simple pairwise distributions, which have been used in the past for similar problems. We present an approach for extracting pairwise models from more complex ones using an energy-based modeling framework. We show that, for the tested models, the extracted pairwise models can replicate the energies of the original models and are also close in performance in tasks like mutational effect prediction. In addition, we show that even simpler, factorized models often come close in performance to the original models.

Publisher

Public Library of Science (PLoS)

Subject

Computational Theory and Mathematics; Cellular and Molecular Neuroscience; Genetics; Molecular Biology; Ecology; Modeling and Simulation; Ecology, Evolution, Behavior and Systematics

Reference: 37 articles.

1. Learning generative models for protein fold families;S Balakrishnan;Proteins: Structure, Function, and Bioinformatics,2011

2. Feinauer C, Weigt M. Context-aware prediction of pathogenicity of missense mutations involved in human disease. arXiv preprint arXiv:1701.07246. 2017.

3. Mutation effects predicted from sequence co-variation;TA Hopf;Nature biotechnology,2017

4. An evolution-based model for designing chorismate mutase enzymes;WP Russ;Science,2020

5. Deciphering protein evolution and fitness landscapes with latent space models;X Ding;Nature communications,2019

Cited by 3 articles.

1. Symmetry, gauge freedoms, and the interpretability of sequence-function relationships;2024-05-13

2. Gauge fixing for sequence-function relationships;2024-05-13

3. Mean Dimension of Generative Models for Protein Sequences;2022-12-14