HN-PPISP: a hybrid network based on MLP-Mixer for protein–protein interaction site prediction

Published:2022-11-19 Issue:1 Volume:24 Page:
ISSN:1467-5463
Container-title:Briefings in Bioinformatics
language:en

Author:

Kang Yan¹, Xu Yulong¹, Wang Xinchao¹, Pu Bin², Yang Xuekun¹, Rao Yulong¹, Chen Jianguo³

Affiliation:

1. National Pilot School of Software, Yunnan University, Kunming, 650091, P.R. China

2. College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, P.R. China

3. School of Software Engineering, Sun Yat-Sen University, Zhuhai, 519082, P.R. China

Abstract

Motivation: Biological experimental approaches to protein–protein interaction (PPI) site prediction are critical for understanding the mechanisms of biochemical processes but are time-consuming and laborious. With the development of Deep Learning (DL) techniques, methods based on Convolutional Neural Networks (CNNs), currently the most popular choice, have been proposed to address these problems. Although significant progress has been made, these methods still have limitations in encoding the characteristics of each amino acid in protein sequences. Current methods cannot efficiently explore the nature of the Position Specific Scoring Matrix (PSSM), secondary structure and raw protein sequences when processing them all together. For PPI site prediction, how to effectively model the PPI context with attention remains an open problem. In addition, the long-distance dependencies of PPI features are important, and capturing them is very challenging for many CNN-based methods, because it is difficult for CNNs, given their innate abilities, to outperform auto-regressive models like Transformers.

Results: To effectively mine the properties of PPI features, a novel hybrid neural network named HN-PPISP is proposed, which integrates a Multi-layer Perceptron Mixer (MLP-Mixer) module for local feature extraction and a two-stage multi-branch module for global feature capture. The model combines the merits of Transformer, TextCNN and Bi-LSTM as a powerful alternative for PPI site prediction. On the one hand, this is the first application of an advanced Transformer (i.e. MLP-Mixer) with a hybrid network for sequence-based PPI prediction. On the other hand, unlike existing methods that treat global features altogether, the proposed two-stage multi-branch hybrid module first assigns different attention scores to the input features and then encodes the features through different branch modules. In the first stage, different improved attention modules are hybridized to extract features from the raw protein sequences, secondary structure and PSSM, respectively. In the second stage, a multi-branch network is designed to aggregate information from both branches in parallel. The two branches encode the features and extract dependencies through several operations such as TextCNN, Bi-LSTM and different activation functions. Experimental results on real-world public datasets show that the model consistently achieves state-of-the-art performance over seven strong baselines.

Availability: The source code of the HN-PPISP model is available at https://github.com/ylxu05/HN-PPISP.
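To make the MLP-Mixer component named in the abstract more concrete, the following is a minimal NumPy sketch of one generic Mixer block (token mixing followed by channel mixing, each with a residual connection). It is not taken from the authors' repository: the function names, the window of 7 residues, the 20 features per residue, the hidden sizes and the simplified layer normalization without learned scale and shift are all assumptions chosen only for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each row over its last axis (no learned scale/shift in this sketch).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron applied along the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2

def init_mlp(rng, dim, hidden):
    # Small random weights and zero biases for a dim -> hidden -> dim MLP.
    return (rng.normal(0.0, 0.02, (dim, hidden)), np.zeros(hidden),
            rng.normal(0.0, 0.02, (hidden, dim)), np.zeros(dim))

def mixer_block(x, token_params, channel_params):
    # x has shape (num_residues, num_features).
    # Token mixing: transpose so the MLP mixes information across residues.
    y = layer_norm(x).T
    x = x + mlp(y, *token_params).T
    # Channel mixing: the MLP mixes the features of each residue independently.
    x = x + mlp(layer_norm(x), *channel_params)
    return x

rng = np.random.default_rng(0)
window, features = 7, 20  # hypothetical sizes, not from the paper
x = rng.normal(size=(window, features))
token_params = init_mlp(rng, window, 16)
channel_params = init_mlp(rng, features, 32)
out = mixer_block(x, token_params, channel_params)
print(out.shape)  # (7, 20)
```

The block preserves the input shape, so several such blocks can be stacked before a classification head. How HN-PPISP actually configures and combines its Mixer module with the attention, TextCNN and Bi-LSTM branches is specified in the paper and its source code, not in this sketch.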

Funder

National Natural Science Foundation of China

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems

Link

https://academic.oup.com/bib/article-pdf/24/1/bbac480/48782521/bbac480.pdf

References: 43 articles.

1. Evidence for dynamically organized modularity in the yeast protein-protein interaction network.[J];Han;Nature,2004

2. Interaction network containing conserved and essential protein complexes in Escherichia coli[J];Butland;Nature,2005

3. Towards a proteome-scale map of the human protein-protein interaction network.[J];Rual;Nature,2005