Author:
Linder Johannes, La Fleur Alyssa, Chen Zibo, Ljubetič Ajasja, Baker David, Kannan Sreeram, Seelig Georg
Abstract
Sequence-based neural networks can learn to make accurate predictions from large biological datasets, but model interpretation remains challenging. Many existing feature attribution methods are optimized for continuous rather than discrete input patterns and assess individual feature importance in isolation, making them ill-suited for interpreting non-linear interactions in molecular sequences. Building on work in computer vision and natural language processing, we developed an approach based on deep generative modeling (Scrambler networks) wherein the most salient sequence positions are identified with learned input masks. Scramblers learn to generate Position-Specific Scoring Matrices (PSSMs) where unimportant nucleotides or residues are ‘scrambled’ by raising their entropy. We apply Scramblers to interpret the effects of genetic variants, uncover non-linear interactions between cis-regulatory elements, explain binding specificity for protein-protein interactions, and identify structural determinants of de novo designed proteins. We show that interpretation based on a generative model allows for efficient attribution across large datasets and results in high-quality explanations, often outperforming state-of-the-art methods.
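The idea of 'scrambling' unimportant positions by raising their entropy can be illustrated with a small sketch. This is not the authors' implementation: in the paper the importance scores come from a trained Scrambler network, whereas here they are set by hand, and the tempering formula (a softmax over log-background plus score-weighted one-hot logits) is an assumption chosen for illustration. The names `scramble_pssm`, `positional_entropy`, and `one_hot` are hypothetical.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed nucleotide alphabet for this toy example


def one_hot(seq):
    """Encode a sequence as an (L, 4) one-hot matrix."""
    idx = [ALPHABET.index(c) for c in seq]
    return np.eye(len(ALPHABET))[idx]


def scramble_pssm(onehot, importance, background=None):
    """Build a PSSM from a one-hot sequence and per-position scores.

    Assumption (illustrative, not from the paper): a score of 0 leaves
    the position at the background distribution (maximal entropy),
    while a large score pushes it back toward the observed symbol.
    """
    onehot = np.asarray(onehot, dtype=float)
    importance = np.asarray(importance, dtype=float)
    n_sym = onehot.shape[1]
    if background is None:
        background = np.full(n_sym, 1.0 / n_sym)  # uniform background
    logits = np.log(background)[None, :] + importance[:, None] * onehot
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def positional_entropy(pssm):
    """Shannon entropy in bits at each position of the PSSM."""
    p = np.clip(pssm, 1e-12, 1.0)
    return -(p * np.log2(p)).sum(axis=1)


seq = "ACGT"
scores = np.array([0.0, 10.0, 0.0, 10.0])  # hand-picked, hypothetical scores
pssm = scramble_pssm(one_hot(seq), scores)
entropy = positional_entropy(pssm)
```

In this toy setup, positions with a score of 0 become uniform over the four nucleotides (fully scrambled), while positions with a high score stay close to the original letter. A predictor evaluated on samples drawn from such a PSSM would therefore see only the high-scoring positions held fixed.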
Publisher
Cold Spring Harbor Laboratory