Abstract
High-throughput image-based profiling platforms are powerful technologies capable of collecting data from billions of cells exposed to thousands of perturbations in a time- and cost-effective manner. As a result, image-based profiling data has been increasingly used for diverse biological applications, such as predicting drug mechanism of action or gene function. However, batch effects severely limit community-wide efforts to integrate and interpret image-based profiling data collected across different laboratories and equipment. To address this problem, we evaluated seven top-ranked batch correction strategies for mRNA profiles in the context of a newly released Cell Painting dataset, the largest publicly accessible image-based dataset. We focused on five use scenarios of varying complexity and found that Harmony, a nonlinear method, consistently outperformed the other tested methods. Furthermore, we provide a framework, benchmark, and metrics for the future assessment of new batch correction methods. Overall, this work paves the way for improvements that allow the community to make the best use of public Cell Painting data for scientific discovery.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.