A systematic analysis of regression models for protein engineering

Published:2024-05-03 Issue:5 Volume:20 Page:e1012061
ISSN:1553-7358
Container-title:PLOS Computational Biology
language:en
Short-container-title:PLoS Comput Biol

Author:

Michael Richard, Kæstel-Hansen Jacob, Mørch Groth Peter, Bartels Simon, Salomon Jesper, Tian Pengfei, Hatzakis Nikos S., Boomsma Wouter

Abstract

Optimizing proteins for particular traits holds great promise for industrial and pharmaceutical purposes. Machine learning is increasingly applied in this field to predict protein properties and thereby guide the experimental optimization process. A natural question is: how much progress are we making with such predictions, and how important is the choice of regressor and representation? In this paper, we demonstrate that conclusions about regressor performance can differ dramatically depending on the choice of metric and on how one defines generalization. We highlight the fundamental issue of sample bias in typical regression scenarios and how it can lead to misleading conclusions about regressor performance. Finally, we make the case for the importance of calibrated uncertainty in this domain.

Funder

Danish Data Science Academy

NNF Center for 4D cellular dynamics

Villum Synergy

Innovation Fund Denmark

MLLS Center

Digital Pilot Hub

Pioneer Centre for AI

Publisher

Public Library of Science (PLoS)

References: 61 articles.

1. Protein misfolding and degradation in genetic diseases;P Bross;Human mutation,1999

2. Proteomics: new perspectives, new biomedical opportunities;RE Banks;The Lancet,2000

3. Protein engineering 20 years on;JA Brannigan;Nature Reviews Molecular Cell Biology,2002

4. Coevolutionary information, protein folding landscapes, and the thermodynamics of natural selection;F Morcos;Proceedings of the National Academy of Sciences,2014

5. How many protein sequences fold to a given structure? A coevolutionary analysis;P Tian;Biophysical journal,2017

Cited by 1 article.

1. Kermut: Composite kernel regression for protein variant effects;2024-05-29