Authors:
Dubourg-Felonneau Geoffroy, Wesego Daniel Mitiku, Akiva Eyal, Varadan Ranjani
Abstract
We introduce a novel dataset named PiNUI: Protein Interactions with Nearly Uniform Imbalance. PiNUI is a dataset of Protein–Protein Interactions (PPI) specifically designed for Machine Learning (ML) applications that offers a higher degree of representativeness of real-world PPI tasks than existing ML-ready PPI datasets. We achieve this by increasing the data size and quality and by minimizing the sampling bias of negative interactions. We demonstrate that models trained on PiNUI almost always outperform those trained on conventional PPI datasets when evaluated on various general PPI tasks using external test sets. PiNUI is available here.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.