Multi-Modal CLIP-Informed Protein Editing
Author:
Yin Mingze, Zhou Hanjing, Zhu Yiheng, Lin Miao, Wu Yixuan, Wu Jialu, Xu Hongxia, Hsieh Chang-Yu, Hou Tingjun, Chen Jintai, Wu Jian
Abstract
Proteins govern most biological functions essential for life, but achieving controllable protein discovery and optimization remains challenging. Recently, machine learning-assisted protein editing (MLPE) has shown promise in accelerating optimization cycles and reducing experimental workloads. However, current methods struggle with the vast combinatorial space of potential protein edits and cannot explicitly conduct protein editing using biotext instructions, limiting their interactivity with human feedback. To fill these gaps, we propose a novel method called ProtET for efficient CLIP-informed protein editing through multi-modality learning. Our approach comprises two stages. In the pretraining stage, contrastive learning aligns the protein and biotext representations, each encoded by its own large language model (LLM). In the subsequent protein editing stage, the fused features from editing instruction texts and original protein sequences serve as the final editing condition for generating target protein sequences. Comprehensive experiments demonstrate the superiority of ProtET in editing proteins to enhance human-expected functionality across multiple attribute domains, including enzyme catalytic activity, protein stability and antibody specific binding ability. ProtET improves on state-of-the-art results by a large margin, with significant stability improvements of 16.67% and 16.90%. This capability positions ProtET to advance real-world artificial protein editing, potentially addressing unmet academic, industrial, and clinical needs.
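The contrastive alignment in the pretraining stage can be illustrated with a generic CLIP-style symmetric InfoNCE loss. The following is a minimal sketch under stated assumptions, not the ProtET implementation: the function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all hypothetical, and the abstract does not specify the exact loss or encoders used.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def clip_contrastive_loss(protein_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    protein_emb[i] and text_emb[i] are assumed to describe the same protein.
    The temperature is an illustrative assumption, not a value from the paper.
    """
    p = [l2_normalize(v) for v in protein_emb]
    t = [l2_normalize(v) for v in text_emb]
    n = len(p)
    # logits[i][j]: temperature-scaled cosine similarity of protein i and text j
    logits = [
        [sum(a * b for a, b in zip(p[i], t[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Protein-to-text direction: row i should assign its mass to column i.
    loss_p2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-protein direction: column j should assign its mass to row j.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2p = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_p2t + loss_t2p)

# Toy batch: two proteins and two descriptions (hypothetical embeddings).
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = clip_contrastive_loss(aligned, aligned)
loss_mismatched = clip_contrastive_loss(aligned, shuffled)
```

In this toy batch the loss is near zero when each protein embedding matches its paired text embedding and large when the pairs are swapped, which is the signal that pulls matching protein and biotext representations together during pretraining.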
Publisher
Cold Spring Harbor Laboratory