Abstract
The core task when analyzing single-cell data (SCD) is dimensional reduction (DR), which aims to find latent signals encoding biological heterogeneity. Here, we dissected DR steps to build TopOMetry, a machine learning framework that learns latent data topology to perform DR in a modular fashion, and show that current analysis practices are biased due to the non-uniformity and non-linearity of the geometry underlying SCD. We used TopOMetry to analyze SCD from peripheral blood mononuclear cells (PBMC), and consistently found a plethora of unreported T CD4 subpopulations that are missed by current workflows across datasets and diseases. We also found comparable T CD4 diversity in the human cerebrospinal fluid. The proposed framework scales to millions of cells, excels at discovering fine-grained local structure, and holds natural connections to clustering and trajectory inference. We hope this powerful tool will accelerate biomedical discovery and inspire new methods to learn and explore phenotypic topology.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.