Author:
Hu Xiaoyue, Li He, Chen Ming, Qian Junbin, Jiang Hangjin
Abstract
Integrating single-cell RNA-sequencing datasets from different sources is a common practice that enables in-depth interrogation of the data for biological insights, and batch effect correction (BEC) is vital to it. However, inappropriate BEC can lead to overcorrection and produce misleading results in downstream analyses, including cell annotation, trajectory inference and cell-cell communication. Hence, we develop Reference-based Batch Effect Testing (RBET), a novel statistical framework for evaluating the performance of different BEC methods by leveraging reference genes inspired by housekeeping genes and MAC statistics for distribution comparison. Compared with existing methods, RBET is more powerful in detecting batch effects, sensitive to overcorrection, computationally efficient, and robust to large batch effect sizes. Furthermore, extensive real examples across multiple scenarios show that RBET selects optimal BEC tools that yield consistent downstream analysis results, which agree with prior biological knowledge. This comprehensive BEC decision-making tool is available as an R package.
Publisher
Cold Spring Harbor Laboratory