Abstract
The rise of single-cell RNA-sequencing (scRNA-seq) and evolved computational algorithms have significantly advanced biomedical science by revealing and visualizing the multifaceted and diverse nature of single cells. These technical advancements have also highlighted the pivotal role of cell clusters as representations of biologically universal entities such as cell types and cell states. However, to some extent, these clusterings remain dataset-specific and method-dependent. To improve comparability across different datasets or compositions, we previously introduced a graph-based representation of cell collections that captures the statistical dependencies of their characteristic genes. While our earlier work focused on theoretical insights, it was not sufficiently adapted and fine-tuned for practical implementation. To address this, the present paper introduces an improved practice to define and evaluate cellular identities based on our theory. First, we provide a concise summary of our previous theory and workflow. Then, point-by-point, we highlight the issues that needed fixing and propose solutions. The framework’s utility was enhanced by leveraging alternative formats of cellular features such as gene ontology (GO) terms and effectively handling dropouts. Supplemental techniques are offered to reinforce the versatility and robustness of our method.
Publisher
Cold Spring Harbor Laboratory