Abstract
Protein subcellular localization prediction plays a crucial role in improving our understanding of different diseases and consequently assists in building drug targeting and drug development pipelines. Proteins are known to co-exist at multiple subcellular locations, which makes the prediction task extremely challenging. A protein interaction network is a graph that captures interactions between different proteins. It is reasonable to assume that if two proteins interact, they share at least some subcellular locations. In this regard, we propose ProtFinder, the first deep learning-based model that relies exclusively on protein interaction networks to predict the multiple subcellular locations of proteins. We also integrate biological priors, such as the cellular component of Gene Ontology, to make ProtFinder a more biology-aware intelligent system. ProtFinder is trained and tested using the STRING and BioPlex databases, whereas the protein annotations are obtained from the Human Protein Atlas. Our model obtained an AUC-ROC score of 90.00% and an MCC score of 83.42% on a held-out set of proteins. We also apply ProtFinder to annotate proteins that currently do not have confident location annotations. We observe that ProtFinder is able to confirm some of these unreliable location annotations, while in some cases complementing the existing databases with novel location annotations. The source code for ProtFinder is available at https://github.com/UCLouvain-CBIO/ProtFinder.
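The assumption that interacting proteins share locations can be illustrated with a minimal neighbour-voting sketch. This is not the ProtFinder model, which is a deep learning approach. It is only a toy baseline, and the protein identifiers, annotations, function name, and 0.5 threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical toy interaction network: protein -> set of interacting partners.
interactions = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3"},
}

# Hypothetical known annotations; a protein may have several locations.
known_locations = {
    "P1": {"nucleus", "cytosol"},
    "P2": {"nucleus"},
    "P4": {"mitochondria"},
}


def predict_locations(protein, interactions, known_locations, threshold=0.5):
    """Return every location carried by at least `threshold` of the
    protein's annotated interaction partners (multi-label output)."""
    annotated = [n for n in interactions.get(protein, ()) if n in known_locations]
    if not annotated:
        return set()  # no annotated partners, so no evidence to vote with
    votes = Counter(loc for n in annotated for loc in known_locations[n])
    return {loc for loc, count in votes.items() if count / len(annotated) >= threshold}


predicted = predict_locations("P3", interactions, known_locations)
print(predicted)
```

Here the unannotated protein `P3` has three annotated partners, and only the location shared by a majority of them passes the threshold. A protein whose partners are all unannotated receives no prediction.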
Publisher
Cold Spring Harbor Laboratory