FLIP: Benchmark tasks in fitness landscape inference for proteins

Published: 2021-11-11

Authors:

Dallago Christian, Mou Jody, Johnston Kadina E., Wittmann Bruce J., Bhattacharya Nicholas, Goldman Samuel, Madani Ali, Yang Kevin K.

Abstract

Machine learning could enable an unprecedented level of control in protein engineering for therapeutic and industrial applications. To be useful for designing proteins with desired properties, machine learning models must capture the protein sequence-function relationship, often termed the fitness landscape. Existing benchmarks like CASP or CAFA assess structure and function predictions of proteins, respectively, yet they do not target metrics relevant for protein engineering. In this work, we introduce Fitness Landscape Inference for Proteins (FLIP), a benchmark for function prediction to encourage rapid scoring of representation learning for protein engineering. Our curated tasks, baselines, and metrics probe model generalization in settings relevant for protein engineering, e.g. low-resource and extrapolative. Currently, FLIP encompasses experimental data across adeno-associated virus stability for gene therapy, protein domain B1 stability and immunoglobulin binding, and thermostability from multiple protein families. To enable ease of use and future expansion to new tasks, all data are presented in a standard format. FLIP scripts and data are freely accessible at https://benchmark.protein.properties.

Publisher

Cold Spring Harbor Laboratory

Cited by 65 articles.

1. Fine-tuning protein language models boosts predictions across diverse tasks;Nature Communications;2024-08-28

2. Benchmarking text-integrated protein language model embeddings and embedding fusion on diverse downstream tasks;2024-08-26

3. Results of the Protein Engineering Tournament: An Open Science Benchmark for Protein Modeling and Design;2024-08-12

4. Simple, Efficient, and Scalable Structure-Aware Adapter Boosts Protein Language Models;Journal of Chemical Information and Modeling;2024-08-07

5. PETA: evaluating the impact of protein transfer learning with sub-word tokenization on downstream applications;Journal of Cheminformatics;2024-08-02