Abstract
Protein-protein interactions (PPIs) are essential to understanding biological pathways as well as their roles in development and disease. Computational tools have been successful at predicting PPIs in silico, but the lack of consistent and reliable frameworks for this task has led to network models that are difficult to compare and discrepancies between algorithms that remain unexplained. To better understand the inference mechanisms that underpin these models, we designed B4PPI, an open-source framework for benchmarking that accounts for a range of biological and statistical pitfalls while facilitating reproducibility. We use B4PPI to shed light on the impact of network topology and how different algorithms deal with highly connected proteins. By studying functional genomics-based and sequence-based models (two of the most popular approaches) on human PPIs, we show their complementarity: the former performs best on lone proteins, while the latter specialises in interactions involving hubs. We also show that algorithm design has little impact on performance with functional genomics data. We replicate our results on both human and S. cerevisiae data and demonstrate that models using functional genomics are better suited to PPI prediction across species. With rapidly increasing amounts of sequence and functional genomics data, our study provides a principled foundation for future construction, comparison and application of PPI networks.
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.