XGboost-Ampy: Identification of AMPylation Protein Function Prediction Using Machine Learning

Published:2022-12-31 Issue:2 Volume:10 Page:83-95
ISSN:2308-8168
Container-title:VAWKUM Transactions on Computer Sciences
Short-container-title:VAWKUM trans. comput. sci.

Author:

Khan Swati Zar Nawab, Ghulam Ali, Sohail Muhammad, Arshed Jawad Usman, Sikander Rahu, Malik Muhammad Shahid, Khan Nauman

Abstract

AMPylation is an emerging post-translational modification in which a phosphodiester bond is formed on the hydroxyl group of threonine, serine, or tyrosine. In this process, adenosine monophosphate is covalently attached to the side chain of an amino acid in a peptide, in a reaction catalyzed by AMPylating enzymes. We used AMPylation peptide sequence data from bacteria, eukaryotes, and archaea to train the models. We then compared several feature extraction methods and their combinations, together with several classification algorithms, to obtain more accurate prediction models. To limit the loss of sequence information, the pseudo amino acid composition (PseAAC) feature is used to construct a fixed-size descriptor vector. The basic feature set is obtained from a second feature extraction method. Together, these features derive protein characteristics from the sequence and from evolutionary information based on the BLOSUM62 amino acid substitution matrix. The eXtreme Gradient Boosting (XGBoost) technique was used to build a new model for this study, which was then compared with the most popular machine learning models. We propose XGBoost-Ampy, a framework for AMPylation site identification that uses the XGBoost algorithm and sequence-derived features. XGBoost-Ampy achieves an accuracy of 86.7%, a sensitivity of 76.1%, a specificity of 97.5%, and a Matthews correlation coefficient (MCC) of 0.753 for predicting AMPylation sites. XGBoost-Ampy, the first machine learning model developed for this problem, has shown promise and may help with AMPylation site identification.
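The four evaluation measures quoted in the abstract are standard functions of a binary confusion matrix. The sketch below shows one common way to compute them; the function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration. They are not the paper's data, and the paper's own evaluation code is not shown in this record.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # fraction of true sites that are recovered
    specificity = tn / (tn + fp)  # fraction of non-sites that are rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not taken from the paper).
metrics = binary_metrics(tp=80, fn=20, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity when the two classes are not equally easy to predict.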

Publisher

VFAST Research Platform

References (51 articles)

1. M. S. Brown, A. Segal, and E. R. Stadtman, “Modulation of glutamine synthetase adenylylation and deadenylylation is mediated by metabolic transformation of the PII-regulatory protein,” Proceedings of the National Academy of Sciences, vol. 68, no. 12, pp. 2949–2953, 1971.

2. O. N. Jensen, “Modification-specific proteomics: characterization of post-translational modifications by mass spectrometry,” Curr. Opin. Chem. Biol., vol. 8, no. 1, pp. 33–41, 2004.

3. K.-K. Han and A. Martinage, “Post-translational chemical modification(s) of proteins,” International Journal of Biochemistry, vol. 24, no. 1, pp. 19–28, 1992.

4. O. N. Jensen, “Modification-specific proteomics: characterization of post-translational modifications by mass spectrometry,” Current Opinion in Chemical Biology, vol. 8, no. 1, pp. 33–41, 2004.

5. R. G. Krishna and F. Wold, “Post-translational modifications of proteins,” Methods in Protein Sequence Analysis, pp. 167–172, 1993.