Abstract
The utility of deep neural nets has been demonstrated for mapping hematoxylin-and-eosin (H&E) stained image features to expression of individual genes. However, these models have not been employed to discover clinically relevant spatial biomarkers. Here we develop MOSBY (Multi-Omic translation of whole slide images for Spatial Biomarker discoverY) that leverages contrastive self-supervised pretraining to extract improved H&E whole slide images features, learns a mapping between image and bulk omic profiles (RNA, DNA, and protein), and utilizes tile-level information to discover spatial biomarkers. We validate MOSBY gene and gene set predictions with spatial transcriptomic and serially-sectioned CD8 IHC image data. We demonstrate that MOSBY-inferred colocalization features have survival-predictive power orthogonal to gene expression, and enable concordance indices highly competitive with survival-trained multimodal networks. We identify and validate 1) an ER stress-associated colocalization feature as a chemotherapy-specific risk factor in lung adenocarcinoma, and 2) the colocalization of T effector cell vs cysteine signatures as a negative prognostic factor in multiple cancer indications. The discovery of clinically relevant biologically interpretable spatial biomarkers showcases the utility of the model in unraveling novel insights in cancer biology as well as informing clinical decision-making.