Abstract
Batch effects are a frequent challenge in deep sequencing data analysis and can lead to misleading conclusions. Existing methods do not correct batch effects satisfactorily, especially in single-cell RNA sequencing (scRNA-seq) data. To address this challenge, we introduce fast-scBatch, a novel and efficient two-phase algorithm for batch-effect correction in scRNA-seq data, designed to handle complex, non-linear batch effects. Specifically, the method uses the inherent correlation structure of the data to correct batch effects and employs a neural network to expedite the process. It outputs a corrected expression matrix, facilitating downstream analyses. We validated fast-scBatch through simulation studies and on two scRNA-seq datasets, where it outperformed current methods in batch-effect correction, as evidenced by UMAP visualizations and by metrics including the Adjusted Rand Index (ARI) and Adjusted Mutual Information (AMI).
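For readers unfamiliar with the ARI metric named above, the following is a minimal sketch of how it can be computed from two label vectors. It is an illustration only, not the authors' evaluation code: the function name `adjusted_rand_index` and the example labels are hypothetical, and the comparison of known cell types against cluster assignments is an assumed use of the metric.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same cells.

    Hypothetical helper for illustration; assumes both inputs are
    equal-length sequences of hashable labels.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: trivially identical partitions

    # Contingency counts: cells sharing a (true label, predicted label) pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of cell pairs placed together under each view.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    # Chance-expected agreement and the maximum attainable agreement.
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2

    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (together_both - expected) / (maximum - expected)


# Assumed example: known cell types versus clusters found after correction.
cell_types = ["T", "T", "B", "B", "NK", "NK"]
clusters = [0, 0, 1, 1, 2, 2]
score = adjusted_rand_index(cell_types, clusters)
```

An ARI of 1 indicates that the clustering reproduces the reference labels exactly (up to renaming), while values near 0 indicate agreement no better than chance.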
Publisher
Cold Spring Harbor Laboratory