Estrogen Receptor Gene Expression Prediction from H&E Whole Slide Images
Author:
Srinivas Anvita A., Jaroensri Ronnachai, Wulczyn Ellery, Wren James H., Thompson Elaine E., Olson Niels, Beckers Fabien, Miao Melissa, Liu Yun, Chen Po-Hsuan Cameron, Steiner David F.
Abstract
Gene expression profiling (GEP) provides valuable information for the care of breast cancer patients. However, the test itself is expensive and can take a long time to process. In contrast, microscopic examination of hematoxylin and eosin (H&E) stained tissue is inexpensive, fast, and integrated into the standard of care. This work explores the possibility of predicting ESR1 gene expression from H&E images, and its use in predicting clinical variables and patient outcomes. We utilized a weakly supervised method to train a deep learning model to predict ESR1 expression from whole slide images, and achieved 0.57 [95% CI: 0.46, 0.67] Pearson’s correlation with the ground truth value. Our ESR1 expression prediction achieved an AUROC of 0.81 [0.74, 0.87] in predicting clinical ER status obtained using an immunohistochemistry staining technique, and a c-index of 0.59 [0.52, 0.65] in predicting progression-free interval for the patients in our cohort. This work further demonstrates the potential to infer gene expression from H&E stained images in a manner that shows meaningful associations with clinical variables. Because obtaining H&E stained images is substantially easier and faster than genetic testing, the capability to derive molecular genetic information from these images may increase access to this type of information for patient risk stratification and provide research insights into molecular-morphological associations.
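The three metrics reported in the abstract (Pearson's correlation, AUROC, and c-index) can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code. The function names are hypothetical, and the percentile bootstrap used for the intervals is an assumption, since the abstract does not say how its confidence intervals were computed. The orientation of the risk score for the c-index (for example, whether lower predicted expression is treated as higher risk) is likewise left to the caller.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def auroc(scores, labels):
    """AUROC: probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def c_index(risk, time, event):
    """Harrell's c-index for right-censored data.

    A pair (i, j) is comparable when case i has an observed event
    (event[i] == 1) strictly before time[j]. It is concordant when
    the case with the earlier event also has the higher risk score.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(risk)):
        for j in range(len(risk)):
            if event[i] == 1 and time[i] < time[j]:
                comparable += 1
                concordant += (risk[i] > risk[j]) + 0.5 * (risk[i] == risk[j])
    return concordant / comparable


def bootstrap_ci(metric, columns, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, see lead-in).

    `columns` is a list of equal-length sequences passed to `metric`
    in order; cases are resampled with replacement.
    """
    rng = random.Random(seed)
    n = len(columns[0])
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(metric(*[[c[i] for i in idx] for c in columns]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples (e.g. a single class)
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi
```

For example, `bootstrap_ci(auroc, [predicted_expression, er_status])` would give an interval for the AUROC of a slide-level prediction against a binary ER label, where both argument names are placeholders for the caller's own data.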
Publisher
Cold Spring Harbor Laboratory