Author:
Zhang Shengli, Zhao Ya, Liang Yunyun
Abstract
In recent years, bacterial resistance has become a serious problem due to the abuse of antibiotics. Antimicrobial peptides (AMPs) have rapidly emerged as a leading alternative to antibiotics because they can quickly target bacteria, fungi, viruses, and cancer cells and counteract the toxins these produce. In this study, a two-branch ensemble framework is proposed to identify AMPs. It integrates extreme gradient boosting (XGBoost) and a bidirectional long short-term memory network (Bi-LSTM) with an attention mechanism to form a stronger model. First, one-hot encoding and k-mer features are used to represent the sequences. Then, the feature vectors are fed into the two base classifiers, each of which produces a predicted value. Finally, the prediction is obtained as a compromise between the two values. As a classical machine learning method, XGBoost is stable and adapts to datasets of different sizes. Bi-LSTM processes each peptide recurrently in both directions, from the N-terminus to the C-terminus and from the C-terminus to the N-terminus. Because this supplies context from both sides of each residue, the model can make more accurate predictions. Our method achieves higher or highly comparable results across eight independent test datasets. The accuracy (ACC) on the XUAMP, YADAMP, DRAMP, CAMP, LAMP, APD3, dbAMP, and DBAASP datasets is 77.9%, 98.5%, 72.5%, 99.8%, 83.0%, 92.4%, 87.5%, and 84.6%, respectively. This shows that the two-branch ensemble structure is feasible and generalizes well. The code and datasets are accessible at https://github.com/z11code/AMP-EF.
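The feature extraction and combination steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the padding length, the choice of k = 2, and the use of a weighted average as the "compromise" are all assumptions made for illustration, since the abstract does not specify them.

```python
from itertools import product

# The 20 standard amino acids, in a fixed (assumed) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq, max_len):
    """Encode a peptide as a max_len x 20 binary matrix.

    Positions beyond the end of the sequence stay all-zero (padding);
    residues outside the 20 standard amino acids are also left as zeros.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, aa in enumerate(seq[:max_len]):
        if aa in index:
            matrix[pos][index[aa]] = 1
    return matrix


def kmer_frequencies(seq, k=2):
    """Return the relative frequency of every possible k-mer (20**k values)."""
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(vocab, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return [counts[v] / total if total else 0.0 for v in vocab]


def combine(p_first, p_second, weight=0.5):
    """Assumed 'compromise': a weighted average of the two branch outputs."""
    return weight * p_first + (1 - weight) * p_second
```

In this sketch, `one_hot` yields a position-wise matrix of the kind a sequence model such as a Bi-LSTM can consume, while `kmer_frequencies` yields a fixed-length vector suited to a tree-based model such as XGBoost. Which feature type feeds which branch in AMP-EF should be checked against the paper and the linked repository.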
Publisher
University Library in Kragujevac
Subject
Applied Mathematics, Computational Theory and Mathematics, Computer Science Applications, General Chemistry
Cited by
1 article.