Prediction of protein-carbohydrate binding sites from protein primary sequence

Published: 2024-02-12

Author:

Nawar Quazi Farah, Nafi Md Muhaiminul Islam, Islam Tasnim Nishat, Rahman M Saifur

Abstract

A protein is a large, complex macromolecule that plays a crucial role in performing most of the work in cells and tissues. It is made up of one or more long chains of amino acid residues. Another important biomolecule, after DNA and protein, is carbohydrate. Carbohydrates interact with proteins to drive various biological processes. Several biochemical experiments exist for studying protein-carbohydrate interactions, but they are expensive, time-consuming and challenging. Developing computational techniques that effectively predict protein-carbohydrate binding interactions from the protein primary sequence has therefore become a prominent new field of research. In this study, we propose StackCBEmbed, an ensemble machine learning model that classifies protein-carbohydrate binding interactions at the residue level. StackCBEmbed combines traditional sequence-based features with features derived from a pre-trained transformer-based protein language model. To the best of our knowledge, ours is the first attempt to apply a protein language model to predicting protein-carbohydrate binding interactions. StackCBEmbed achieved sensitivity, specificity and balanced accuracy scores of 0.730, 0.821 and 0.776 on one independent test set and 0.666, 0.818 and 0.742 on the other. This performance is superior to that of earlier prediction models benchmarked on the same datasets. We thus hope that StackCBEmbed will help discover novel protein-carbohydrate interactions and advance the related fields of research. StackCBEmbed is freely available as Python scripts at https://github.com/nafiislam/StackCBEmbed.
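The three reported metrics are related: balanced accuracy is the mean of sensitivity and specificity, which matters for residue-level binding prediction because non-binding residues typically far outnumber binding ones. The sketch below is not taken from the StackCBEmbed repository; the function names are illustrative assumptions. It only checks that the figures quoted in the abstract are mutually consistent, within the rounding of the three-decimal values.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true binding residues predicted as binding."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true non-binding residues predicted as non-binding."""
    return tn / (tn + fp)


def balanced_accuracy(sens: float, spec: float) -> float:
    """Mean of sensitivity and specificity."""
    return (sens + spec) / 2.0


# (sensitivity, specificity, balanced accuracy) as quoted in the abstract,
# one tuple per independent test set.
reported = [
    (0.730, 0.821, 0.776),
    (0.666, 0.818, 0.742),
]

for sens, spec, quoted in reported:
    computed = balanced_accuracy(sens, spec)
    # The inputs are already rounded to three decimals, so agreement
    # is only expected to within about 0.001.
    consistent = abs(computed - quoted) < 1e-3
    print(f"sens={sens:.3f} spec={spec:.3f} "
          f"computed={computed:.4f} quoted={quoted:.3f} consistent={consistent}")
```

The `sensitivity` and `specificity` helpers take raw confusion-matrix counts, so the same check can be run on any set of per-residue predictions.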

Publisher

Cold Spring Harbor Laboratory
