Abstract
Spatial transcriptomics (ST) quantifies RNA at high spatial resolution. Hematoxylin and eosin (H&E) stained images, the gold standard in medical diagnosis, reveal tissue structure that correlates with gene expression patterns. However, current methods for predicting spatial gene expression from H&E images often overlook spatial relationships. We introduce ResSAT (Residual networks - Self-Attention Transformer), a framework that generates spatially resolved gene expression profiles from H&E images by capturing tissue structures and applying a self-attention transformer to enhance prediction. In benchmarks on 10x Visium datasets, ResSAT significantly outperformed existing methods, suggesting that it could reduce ST profiling costs and enable rapid acquisition of large numbers of profiles.
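To make the role of self-attention concrete, the sketch below shows how per-spot image features could exchange information across spots before a linear layer maps them to gene expression values. It is a minimal illustration under stated assumptions, not the ResSAT implementation: the feature vectors are random stand-ins for what a residual network might extract from H&E patches, the weights are untrained, and all names and sizes (`spot_features`, `d_model`, `n_genes`, and so on) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(features, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over spots.

    features: (n_spots, d_model) array, one feature vector per spot.
    Returns the attended features (with a residual connection) and
    the (n_spots, n_spots) attention weights.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    # The residual connection assumes the value dimension equals d_model.
    return features + weights @ v, weights

# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
n_spots, d_model, n_genes = 6, 16, 4

# Stand-in for image features extracted from the H&E patch at each spot.
spot_features = rng.normal(size=(n_spots, d_model))

# Untrained projection weights (assumption: a real model would learn these).
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(3))
w_out = rng.normal(scale=0.1, size=(d_model, n_genes))

context, attn = self_attention(spot_features, w_q, w_k, w_v)
predicted_expression = context @ w_out  # one row of gene values per spot

print(predicted_expression.shape)
```

Each row of `attn` sums to one, so the prediction for a spot is informed by a weighted mixture of every other spot's features, which is one way a model can take spatial relationships into account.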