Abstract
The methodology of Specific Peptides (SP) was introduced in the context of enzymes. It is based on an unsupervised machine learning (ML) tool for motif extraction, followed by supervised annotation of the motifs. In the case of enzymes, the annotation label is the Enzyme Commission (EC) number. Here we demonstrate that this method reaches a precision of 96.5% and a recall of 89.1% on presently available protein sequences. We also apply the method to two other protein families, GPCR and ZF, find their corresponding SPs, and provide code for searching any protein sequence for its classification under any such family.
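The search step described above can be illustrated with a minimal sketch. It is not the authors' released code: the function names, the `min_hits` threshold, the majority-vote rule, and the way precision and recall are counted are all assumptions made for illustration, and the peptides in `EXAMPLE_SPS` are placeholder strings rather than entries from the published SP lists.

```python
from collections import Counter

# Hypothetical SP table mapping a peptide motif to its annotation label
# (an EC number prefix or a family name). Placeholder entries only.
EXAMPLE_SPS = {
    "GDSGGP": "3.4.21",
    "HGTHVAG": "3.4.21",
    "CWLAD": "GPCR",
}


def find_sp_hits(sequence, sp_table):
    """Return sorted (position, peptide, label) tuples for every exact SP match."""
    hits = []
    for peptide, label in sp_table.items():
        start = sequence.find(peptide)
        while start != -1:
            hits.append((start, peptide, label))
            # Continue one residue later so overlapping matches are found too.
            start = sequence.find(peptide, start + 1)
    return sorted(hits)


def classify(sequence, sp_table, min_hits=1):
    """Assign the label supported by the most SP hits, or None if too few hits.

    Majority voting and the min_hits cutoff are assumptions of this sketch.
    """
    counts = Counter(label for _, _, label in find_sp_hits(sequence, sp_table))
    if not counts:
        return None
    label, n_hits = counts.most_common(1)[0]
    return label if n_hits >= min_hits else None


def precision_recall(predicted, actual):
    """Precision: correct / sequences given a label.
    Recall: correct / sequences that have a true label.
    """
    pairs = list(zip(predicted, actual))
    correct = sum(1 for p, a in pairs if p is not None and p == a)
    n_predicted = sum(1 for p, _ in pairs if p is not None)
    n_labeled = sum(1 for _, a in pairs if a is not None)
    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_labeled if n_labeled else 0.0
    return precision, recall


if __name__ == "__main__":
    seq = "MKTGDSGGPLLAAHGTHVAGXX"
    print(find_sp_hits(seq, EXAMPLE_SPS))
    print(classify(seq, EXAMPLE_SPS))
```

Exact substring matching keeps the sketch simple; a sequence with no SP hit is left unclassified, which lowers recall without affecting precision.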
Publisher
Cold Spring Harbor Laboratory