Abstract
Recent spatial transcriptomics (ST) technologies have enabled sub-single-cell resolution profiling of gene expression across the whole transcriptome. However, the transition to high-definition ST has significantly increased sparsity and dimensionality, posing computational challenges in discerning cell identities, understanding neighborhood structure, and identifying differential expression, all of which are crucial steps in studying normal and disease ST samples. Here we present STHD, a novel machine learning method for probabilistic cell typing of single spots in whole-transcriptome, high-resolution ST data. Unlike the current binning-aggregation-deconvolution strategy, STHD directly models gene expression at the single-spot level to infer cell type identities. It addresses sparsity by modeling count statistics, incorporating neighbor similarities, and leveraging reference single-cell RNA-seq data. We demonstrate that STHD accurately predicts cell type identities at the single-spot level, which automatically achieves precise segmentation of global tissue architecture and local multicellular neighborhoods. The STHD labels facilitate various downstream analyses, including cell type-stratified bin aggregation, spatial compositional comparison, and cell type-specific differential expression analyses. These high-resolution labels further define frontlines of inter-cell type interactions, revealing direct cell-cell communication activities at immune hubs of a colon cancer sample. Overall, computational modeling of high-resolution spots with STHD uncovers precise spatial organization and deeper biological insights into disease mechanisms.
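To make the three ingredients named in the abstract concrete (count statistics, neighbor similarity, and reference single-cell profiles), the following is a minimal illustrative sketch, not the STHD implementation. It assumes a Poisson-style count likelihood against normalized reference profiles and a simple iterative neighbor-averaging prior. The function name `spot_cell_type_posteriors`, the parameter `beta`, and the toy data are all hypothetical.

```python
import numpy as np

def _softmax(z):
    # Row-wise softmax with max subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def spot_cell_type_posteriors(counts, ref_profiles, neighbors, beta=1.0, n_iter=10):
    """Illustrative per-spot cell typing (hypothetical, not the STHD code).

    counts:       (n_spots, n_genes) raw counts per spot
    ref_profiles: (n_types, n_genes) reference expression, each row sums to 1
    neighbors:    list of index arrays, neighbors[i] = spots adjacent to spot i
    beta:         assumed weight of the neighbor-similarity term
    """
    counts = np.asarray(counts, dtype=float)
    ref = np.asarray(ref_profiles, dtype=float)
    n_types = ref.shape[0]
    # Poisson log-likelihood up to terms that do not depend on the cell type:
    # sum over genes of x_g * log(p_{type, g}).
    log_lik = counts @ np.log(ref + 1e-12).T
    probs = _softmax(log_lik)
    uniform = np.full(n_types, 1.0 / n_types)
    for _ in range(n_iter):
        # Average the current neighbor probabilities and use them as a soft prior.
        nb_mean = np.stack([
            probs[idx].mean(axis=0) if len(idx) else uniform
            for idx in neighbors
        ])
        probs = _softmax(log_lik + beta * np.log(nb_mean + 1e-12))
    return probs

# Toy example: 6 spots on a line, 4 genes, 2 cell types; spot 1 has zero counts.
ref = np.array([[0.45, 0.45, 0.05, 0.05],
                [0.05, 0.05, 0.45, 0.45]])
counts = np.array([[2, 1, 0, 0],
                   [0, 0, 0, 0],
                   [1, 2, 0, 0],
                   [0, 0, 2, 1],
                   [0, 0, 1, 1],
                   [0, 0, 1, 2]])
neighbors = [np.array([1]), np.array([0, 2]), np.array([1, 3]),
             np.array([2, 4]), np.array([3, 5]), np.array([4])]

probs = spot_cell_type_posteriors(counts, ref, neighbors)
probs_no_neighbors = spot_cell_type_posteriors(counts, ref, neighbors, beta=0.0)
labels = probs.argmax(axis=1)
```

In this toy setup the empty spot has no information of its own, so with `beta=0.0` its probabilities stay uniform, while the neighbor term lets it borrow the identity of the adjacent spots. This is only meant to show why neighbor information helps with sparse spots; the actual model, loss, and optimization used by STHD are described in the paper itself.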
Publisher
Cold Spring Harbor Laboratory