Affiliation:
1. College of Information Engineering, Beijing Institute of Petrochemical Technology, 19 Qingyuan North Road, Daxing District, Beijing 102617, P. R. China
Abstract
The polyproline-II (PPII) structural domain plays a crucial role in signal transduction, transcription, cell metabolism, and immune response. It is also a critical structural domain in several disease-associated proteins, so recognizing PPII is essential for understanding protein structure and function. To predict PPII in proteins accurately, we propose a novel method, AAindex-PPII, which characterizes protein sequences using only amino acid indices and predicts PPII with a composite deep learning model that combines a Bidirectional Gated Recurrent Unit (BiGRU) with an improved TextCNN. Experimental results show that, when tested on the same datasets, our method outperforms the state-of-the-art BERT-PPII method, achieving an AUC of 0.845 on the strict data and 0.813 on the non-strict data, which are 0.024 and 0.03 higher, respectively, than those of BERT-PPII. This study demonstrates that the proposed method is simple and efficient for PPII prediction and requires neither large pre-trained models nor complex features such as position-specific scoring matrices.
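To illustrate what characterizing a sequence by an amino acid index might look like, the following is a minimal sketch, not the authors' implementation. The abstract does not state which indices, window size, or network hyperparameters AAindex-PPII uses, so this example assumes a single index (the Kyte-Doolittle hydropathy scale) and a hypothetical fixed-size window around each residue. The function names `encode_sequence` and `sliding_windows` are illustrative only.

```python
# Kyte-Doolittle hydropathy scale, used here as one example of an amino acid
# index. Which indices the paper actually uses is not stated in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_sequence(seq, index=KYTE_DOOLITTLE, unknown=0.0):
    """Map each residue to its index value; unknown residues get `unknown`."""
    return [index.get(residue, unknown) for residue in seq.upper()]


def sliding_windows(values, size, pad=0.0):
    """Return one window of `size` values centered on each position.

    `size` is assumed to be odd; the ends are padded with `pad` so that
    every residue gets a window of the same length.
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    padded = [pad] * half + list(values) + [pad] * half
    return [padded[i:i + size] for i in range(len(values))]


if __name__ == "__main__":
    encoded = encode_sequence("PPGPPG")
    for window in sliding_windows(encoded, size=3):
        print(window)
```

In a setup like the one the abstract describes, per-residue windows of this kind (typically stacked over several indices) would be the input to the sequence model; the BiGRU and TextCNN stages are omitted here because their configuration is not given.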
Funder
The Cross-Disciplinary Science Foundation from Beijing Institute of Petrochemical Technology
Publisher
World Scientific Pub Co Pte Ltd
Subject
Computer Science Applications, Molecular Biology, Biochemistry
Cited by
1 article.