Abstract
AbstractDe novo gene emergence provides a route for new proteins to be formed from previously non-coding DNA. Proteins born in this way are considered random sequences and typically assumed to lack defined structure. While it remains unclear how likely a de novo protein is to assume a soluble and stable tertiary structure, intersecting evidence from random sequence and de novo-designed proteins suggests that native-like biophysical properties are abundant in sequence space. Taking putative de novo proteins identified in human and fly, we experimentally characterize a library of these sequences to assess their solubility and structure propensity. We compare this library to a set of synthetic random proteins with no evolutionary history. Bioinformatic prediction suggests that de novo proteins may have remarkably similar distributions of biophysical properties to unevolved random sequences of a given length and amino acid composition. However, upon expression in vitro, de novo proteins exhibit moderately higher solubility which is further induced by the DnaK chaperone system. We suggest that while synthetic random sequences are a useful proxy for de novo proteins in terms of structure propensity, de novo proteins may be better integrated in the cellular system than random expectation, given their higher solubility.
Funder
EC | Horizon 2020 Framework Programme
Univerzita Karlova v Praze
Volkswagen Foundation
DAAD Research Scholarship for doctoral students
Publisher
Springer Science and Business Media LLC
Subject
Ecology,Ecology, Evolution, Behavior and Systematics
Cited by
17 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献