Interpretably deep learning amyloid nucleation by massive experimental quantification of random sequences

Published: 2024-07-17

Author:

Thompson Mike, Martín Mariano, Olmo Trinidad Sanmartín, Rajesh Chandana, Koo Peter K., Bolognesi Benedetta, Lehner Ben

Abstract

Insoluble amyloid aggregates are the hallmarks of more than fifty human diseases, including the most common neurodegenerative disorders. The process by which soluble proteins nucleate to form amyloid fibrils is, however, quite poorly characterized. Relatively few sequences are known that form amyloids with high propensity, and this data shortage likely limits our capacity to understand, predict, engineer, and prevent the formation of amyloid fibrils. Here we quantify the nucleation of amyloids at an unprecedented scale and use the data to train a deep learning model of amyloid nucleation. In total, we quantify the nucleation rates of >100,000 20-amino-acid-long peptides. This large and diverse dataset allows us to train CANYA, a convolution-attention hybrid neural network. CANYA is fast and outperforms existing methods with stable performance across diverse prediction tasks. Interpretability analyses reveal CANYA’s decision-making process and learned grammar, providing mechanistic insights into amyloid nucleation. Our results illustrate the power of massive experimental analysis of random sequence-spaces and provide an interpretable and robust neural network model to predict amyloid nucleation.

Publisher

Cold Spring Harbor Laboratory

References (61 articles)

1. Protein Misfolding, Amyloid Formation, and Human Disease: A Summary of Progress Over the Last Decade

2. Functional amyloid – from bacteria to humans

3. Half a century of amyloids: past, present and future

4. Dobson, C. M., Knowles, T. P. J. & Vendruscolo, M. The Amyloid Phenomenon and Its Significance in Biology and Medicine. Cold Spring Harb. Perspect. Biol. 12, (2020).

5. Molecular pathology of neurodegenerative diseases by cryo-EM of amyloids