Abstract
AbstractPeptide hormones are genome-encoded signal transduction molecules released in multicellular organisms. The dysregulation of hormone release can cause multiple health problems and it is crucial to study these hormones for therapeutic purposes. To help the research community working in this field, we developed a prediction server that classifies hormonal peptides and non-hormonal peptides. The dataset used in this study was collected for both plants and animals from Hmrbase2 and PeptideAtlas databases. It comprises non-redundant 1174 hormonal and 1174 non-hormonal peptide sequences which were combined and divided into 80% training and 20% validation sets. We extracted a wide variety of compositional features from these sequences to develop various Machine Learning (ML) and Deep Learning (DL) models. The best performing model was logistic regression model trained on top 50 features which achieved an AUROC of 0.93. To enhance the performance of ML model, we applied Basic Local Alignment Search Tool (BLAST) to identify hormonal sequences using similarity among them, and motif search using Motif-Emerging and Classes-Identification (MERCI) to detect motifs present in hormonal and non-hormonal sequences. We combined our best performing classification model, i.e., logistic regression model with BLAST and MERCI to form a hybrid model that can predict hormonal peptide sequences accurately. The hybrid model is able to achieve an AUROC of 0.96, an accuracy of 89.79%, and an MCC of 0.8 on the validation set. This hybrid model has been incorporated on the publicly available website of HOPPred athttps://webs.iiitd.edu.in/raghava/hoppred/.
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献