Abstract
AbstractUnderstanding the emergence and structural characteristics ofde novoand random proteins is crucial for unraveling protein evolution and designing novel enzymes. However, experimental determination of their structures remains challenging. Recent advancements in protein structure prediction, particularly with AlphaFold2 (AF2), have expanded our knowledge of protein structures, but their applicability tode novoand random proteins is unclear. In this study, we investigate the structural predictions and confidence scores of AF2 and protein language model (pLM)-based predictor ESMFold forde novo, random, and conserved proteins. We find that the structural predictions forde novoand random proteins differ significantly from conserved proteins. Interestingly, a positive correlation between disorder and confidence scores (pLDDT) is observed forde novoand random proteins, in contrast to the negative correlation observed for conserved proteins. Furthermore, the performance of structure predictors forde novoand random proteins is hampered by the lack of sequence identity. We also observe varying predicted disorder among different sequence length quartiles for random proteins, suggesting an influence of sequence length on disorder predictions. In conclusion, while structure predictors provide initial insights into the structural composition ofde novoand random proteins, their accuracy and applicability to such proteins remain limited. Experimental determination of their structures is necessary for a comprehensive understanding. The positive correlation between disorder and pLDDT could imply a potential for conditional folding and transient binding interactions ofde novoand random proteins.
Publisher
Cold Spring Harbor Laboratory
Cited by
5 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献