Application of artificial intelligence and machine learning techniques to the analysis of dynamic protein sequences

Published:2024-05-29 Issue:10 Volume:92 Page:1234-1241
ISSN:0887-3585
Container-title:Proteins: Structure, Function, and Bioinformatics
language:en
Short-container-title:Proteins

Author:

Kombo David C.¹, LaMarche Matthew J.¹, Konkankit Chilaluck C.², Rackovsky S.²

Affiliation:

1. Department of Medicinal Chemistry Integrated Drug Discovery, Sanofi, Cambridge, Massachusetts, USA

2. Department of Chemistry and Chemical Biology, Baker Laboratory, Cornell University, Ithaca, New York, USA

Abstract

We apply methods of Artificial Intelligence and Machine Learning to protein dynamic bioinformatics. We rewrite the sequences of a large protein data set, containing both folded and intrinsically disordered molecules, using a previously developed representation that encodes the intrinsic dynamic properties of the naturally occurring amino acids. We then Fourier-analyze the resulting sequences. We demonstrate that classification models built with several different supervised learning methods successfully distinguish folded from intrinsically disordered proteins from sequence alone. We further show that the most important sequence property for this discrimination is the sequence mobility, defined as the sequence-averaged value of the residue‐specific average alpha-carbon B factor. This agrees with previous work, in which we demonstrated the central role played by the sequence mobility in protein dynamic bioinformatics and biophysics. This finding opens a path to the application of dynamic bioinformatics, in combination with machine learning algorithms, to a range of significant biomedical problems.
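The two sequence-level quantities named in the abstract, a sequence-averaged per-residue property and a Fourier analysis of the numerically rewritten sequence, can be illustrated with a minimal sketch. This is not the authors' implementation: the `PLACEHOLDER_MOBILITY` table, the function names, and the power-spectrum normalization are all assumptions made for illustration, and the numbers are invented rather than taken from the paper's residue-specific B-factor data.

```python
import cmath

# Hypothetical per-residue values (arbitrary units). These are placeholders
# for illustration only, NOT the residue-specific average alpha-carbon
# B factors used in the paper.
PLACEHOLDER_MOBILITY = {
    "A": 0.9, "G": 1.2, "S": 1.1, "P": 1.0, "E": 1.15,
    "K": 1.1, "L": 0.8, "V": 0.75, "I": 0.7, "W": 0.65,
}


def encode(sequence):
    """Rewrite an amino-acid string as a list of numeric values."""
    return [PLACEHOLDER_MOBILITY[res] for res in sequence]


def sequence_mobility(sequence):
    """Sequence-averaged value of the per-residue property."""
    values = encode(sequence)
    return sum(values) / len(values)


def power_spectrum(sequence):
    """Squared magnitude of the discrete Fourier transform, divided by length.

    The 1/n normalization is an assumption of this sketch; with it, the
    spectrum sums to the sum of squared encoded values (Parseval).
    """
    values = encode(sequence)
    n = len(values)
    spectrum = []
    for k in range(n):
        coeff = sum(
            v * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, v in enumerate(values)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum
```

In this sketch the zero-frequency Fourier coefficient equals the sequence length times the sequence average, so the averaged property and the Fourier representation are directly linked. Features of this kind could then be passed to any supervised classifier; the choice of classifier is left open here, as the abstract does not name the methods used.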

Funder

National Institutes of Health

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/prot.26704
