Abstract
Single-cell RNA sequencing (scRNA-seq) enables the study of individual cells, yet high gene dimensionality and low cell numbers make the analysis challenging. Moreover, only a subset of the detected genes is involved in the biological processes underlying cell-type-specific functions. We present COMSE, an unsupervised feature selection framework that uses community detection to capture informative genes from scRNA-seq data. COMSE identified cell substates at high resolution, as demonstrated by its ability to distinguish cells at different stages of the cell cycle. Evaluations on real and simulated scRNA-seq datasets showed that COMSE outperformed other methods in cell clustering, even at high dropout rates. We also demonstrate that, by identifying communities of genes associated with batch effects, COMSE separates biological differences from batch effects, thereby enabling integrated analysis of scRNA-seq datasets generated on different platforms.
Publisher
Cold Spring Harbor Laboratory