Issues in performance evaluation for host–pathogen protein interaction prediction

Published:2016-06 Issue:03 Volume:14 Page:1650011
ISSN:0219-7200
Container-title:Journal of Bioinformatics and Computational Biology
language:en
Short-container-title:J. Bioinform. Comput. Biol.

Author:

Abbasi Wajid Arshad¹, Minhas Fayyaz Ul Amir Afsar¹

Affiliation:

1. Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan

Abstract

The study of interactions between host and pathogen proteins is important for understanding the underlying mechanisms of infectious diseases and for developing novel therapeutic solutions. Wet-lab techniques for detecting protein–protein interactions (PPIs) can benefit from computational predictions. Machine learning is one of the computational approaches that can assist biologists by predicting promising PPIs. A number of machine-learning-based methods for predicting host–pathogen interactions (HPI) have been proposed in the literature. The techniques used for assessing the accuracy of such predictors are of critical importance in this domain. In this paper, we question the effectiveness of K-fold cross-validation for estimating the generalization ability of HPI prediction for proteins with no known interactions. K-fold cross-validation does not model this scenario, and we demonstrate a sizable difference between the performance it reports and the performance reported by an alternative evaluation scheme called leave one pathogen protein out (LOPO) cross-validation. LOPO is more effective in modeling the real-world use of HPI predictors, specifically for cases in which no information about the interacting partners of a pathogen protein is available during training. We also point out that currently used metrics, such as the areas under the precision-recall or receiver operating characteristic curves, are not intuitive to biologists, and we propose simpler and more directly interpretable metrics for this purpose.
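The difference between the two evaluation schemes described in the abstract can be illustrated with a minimal sketch. It assumes each example is a (host protein, pathogen protein) pair; the function names, the unshuffled K-fold assignment, and the toy identifiers are hypothetical and are not taken from the paper. The sketch only builds the splits and checks whether a held-out pathogen protein also appears in the training set.

```python
from collections import defaultdict


def lopo_splits(pairs):
    """Yield (pathogen, train_idx, test_idx), holding out one pathogen protein at a time.

    `pairs` is a list of (host_id, pathogen_id) tuples, one per example.
    """
    by_pathogen = defaultdict(list)
    for i, (_host, pathogen) in enumerate(pairs):
        by_pathogen[pathogen].append(i)
    for pathogen, test_idx in by_pathogen.items():
        held_out = set(test_idx)
        # Every example involving the held-out pathogen protein is excluded from training.
        train_idx = [i for i in range(len(pairs)) if i not in held_out]
        yield pathogen, train_idx, test_idx


def kfold_splits(n_examples, k):
    """Plain K-fold over example indices (round-robin, no shuffling), for contrast."""
    for fold in range(k):
        test_idx = [i for i in range(n_examples) if i % k == fold]
        train_idx = [i for i in range(n_examples) if i % k != fold]
        yield train_idx, test_idx


def leaks_pathogen(pairs, train_idx, test_idx):
    """Return True if any test pathogen protein also occurs in the training set."""
    train_pathogens = {pairs[i][1] for i in train_idx}
    return any(pairs[i][1] in train_pathogens for i in test_idx)


# Hypothetical toy data: three pathogen proteins, two examples each.
pairs = [
    ("H1", "P1"), ("H2", "P1"),
    ("H3", "P2"), ("H1", "P2"),
    ("H2", "P3"), ("H3", "P3"),
]

lopo_leaks = [leaks_pathogen(pairs, tr, te) for _p, tr, te in lopo_splits(pairs)]
kfold_leaks = [leaks_pathogen(pairs, tr, te) for tr, te in kfold_splits(len(pairs), 3)]
```

On this toy data, every K-fold test fold contains pathogen proteins that were also seen in training, whereas no LOPO fold does. This is the sense in which K-fold does not model prediction for a pathogen protein with no known interaction partners.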

Publisher

World Scientific Pub Co Pte Ltd

Subject

Computer Science Applications, Molecular Biology, Biochemistry

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0219720016500116

Cited by 22 articles.

1. A Deep Learning Framework for Predicting Protein Functions With Co-Occurrence of GO Terms; IEEE/ACM Transactions on Computational Biology and Bioinformatics; 2023-03-01

2. Analysing Wireless Capsule Endoscopy Images Using Deep Learning Frameworks to Classify Different GI Tract Diseases; 2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM); 2023-01-03

3. Machine learning methods for protein-protein binding affinity prediction in protein design; Frontiers in Bioinformatics; 2022-12-16

4. COYOTE: Sequence-derived structural descriptors-based computational identification of glycoproteins; Journal of Bioinformatics and Computational Biology; 2022-09-12

5. deepHPI: a comprehensive deep learning platform for accurate prediction and visualization of host–pathogen protein–protein interactions; Briefings in Bioinformatics; 2022-04-30