Abstract
Protein complexes are groups of two or more polypeptide chains that join together to build noncovalent networks of protein interactions. A number of computational methods have been developed to identify protein complexes and their members from these interaction networks. While most existing methods identify protein complexes from protein-protein interaction (PPI) networks reasonably well, the applicability of advanced graph network methods has not yet been adequately investigated. In this paper, we propose several graph convolutional network (GCN) methods to improve the detection of protein functional complexes. First, we formulate protein complex detection as a node classification problem. Second, we apply the Neural Overlapping Community Detection (NOCD) model to cluster the nodes (proteins) using a complex affiliation matrix. We also present a representation learning approach that combines a multi-class GCN feature extractor (to obtain the node features) with the mean shift clustering algorithm (to perform the clustering). In addition, we improve the efficiency of the multi-class GCN by converting its dense-dense matrix operations into dense-sparse or sparse-sparse matrix operations, which reduces its space and time complexity and significantly improves the scalability of the existing GCN. Finally, we apply clustering aggregation to find the best protein complexes: a grid search is performed over the complexes detected by three well-known protein complex detection methods, namely ClusterONE, CMC, and PEWCC, which are combined using the Meta-Clustering Algorithm (MCLA) and the Hybrid Bipartite Graph Formulation (HBGF) algorithm. The proposed GCN-based methods were tested on various publicly available datasets and performed significantly better than previous state-of-the-art methods.
The code and data used in this study are available from https://github.com/Analystharsh/GCN_complex_detection
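The abstract mentions replacing dense-dense matrix operations in the GCN with dense-sparse ones. The following is a minimal sketch of that idea, not the authors' implementation: it stores the symmetrically normalized adjacency matrix D^-1/2 (A + I) D^-1/2 as coordinate (row, column, value) triples and multiplies it by a dense feature matrix. The function names, the four-protein toy network, and the feature values are all hypothetical and chosen only for illustration.

```python
import math


def normalized_adjacency_coo(num_nodes, edges):
    """Return (row, col, value) triples of D^-1/2 (A + I) D^-1/2.

    Only nonzero entries are stored, so memory grows with the number
    of interactions rather than with num_nodes ** 2.
    """
    pairs = set()
    for u, v in edges:
        # PPI edges are treated as undirected (an assumption of this sketch).
        pairs.add((u, v))
        pairs.add((v, u))
    for i in range(num_nodes):
        pairs.add((i, i))  # self-loops, as in the standard GCN propagation rule
    degree = [0] * num_nodes
    for u, _ in pairs:
        degree[u] += 1
    return [(u, v, 1.0 / math.sqrt(degree[u] * degree[v]))
            for u, v in sorted(pairs)]


def sparse_dense_matmul(coo, features):
    """Multiply a COO sparse matrix by a dense matrix given as a list of rows."""
    width = len(features[0])
    out = [[0.0] * width for _ in range(len(features))]
    for u, v, weight in coo:
        for k in range(width):
            out[u][k] += weight * features[v][k]
    return out


def dense_matmul(matrix, features):
    """Reference dense-dense product, used only to check the sparse result."""
    return [[sum(row[j] * features[j][k] for j in range(len(features)))
             for k in range(len(features[0]))]
            for row in matrix]


# Hypothetical toy network: four proteins connected in a chain.
num_nodes = 4
edges = [(0, 1), (1, 2), (2, 3)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

coo = normalized_adjacency_coo(num_nodes, edges)
propagated = sparse_dense_matmul(coo, features)

# Build the equivalent dense matrix to confirm both routes agree.
dense = [[0.0] * num_nodes for _ in range(num_nodes)]
for u, v, weight in coo:
    dense[u][v] = weight
reference = dense_matmul(dense, features)

print(len(coo), "stored entries instead of", num_nodes * num_nodes)
print(propagated)
```

In practice a sparse-matrix library would replace the hand-written loops; the sketch only shows why the cost of one propagation step scales with the number of stored interactions rather than with the square of the number of proteins.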
Publisher
Cold Spring Harbor Laboratory