Author:
Fan Fan, Martinez Georgia, DeSilvio Thomas, Shin John, Chen Yijiang, Jacobs Jackson, Wang Bangchen, Ozeki Takaya, Lafarge Maxime W., Koelzer Viktor H., Barisoni Laura, Madabhushi Anant, Viswanath Satish E., Janowczyk Andrew
Abstract
Batch effects (BEs) are systematic technical differences in data collection, unrelated to biological variation, that introduce noise shown to negatively impact machine learning (ML) model generalizability. Here we release CohortFinder (http://cohortfinder.com), an open-source tool aimed at mitigating BEs via data-driven cohort partitioning. We demonstrate that CohortFinder improves ML model performance in downstream digital pathology and medical image processing tasks. CohortFinder is freely available for download at cohortfinder.com.
Publisher
Springer Science and Business Media LLC