Abstract
Although the tumor microenvironment (TME) and gene mutations are the main determinants of progression in lung cancer, the deadliest cancer worldwide, their interrelations are not well understood. Digital pathology data provide unique insight into the spatial composition of the TME. Various spatial metrics and machine learning approaches have been proposed to predict either patient survival or gene mutations from these data. Still, these approaches are limited in the scope of the features they analyze and in their explainability, and as such fail to transfer to clinical practice. Here, we generated 23,199 image patches from 55 hematoxylin-and-eosin (H&E)-stained lung cancer tissue sections and annotated them into 9 tissue classes. Using this dataset, we trained a deep neural network, ARA-CNN, achieving per-class AUC ranging from 0.72 to 0.99. We applied the trained network to segment 467 lung cancer H&E images downloaded from The Cancer Genome Atlas (TCGA) database. We used the segmented images to compute human-interpretable features reflecting the heterogeneous composition of the TME, and successfully used them to predict patient survival (c-index 0.723) and cancer gene mutations (largest AUC 73.5% for PDGFRB). Our approach can be generalized to different cancer types to inform precision medicine strategies.
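The survival result above is reported as a c-index (concordance index). As a reading aid, the following is a minimal sketch of how a Harrell-style c-index can be computed from follow-up times, event indicators, and predicted risk scores. It is not the authors' implementation. The function name `concordance_index`, its arguments, and the toy data are assumptions made for illustration, and pairs with tied follow-up times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell-style c-index (illustrative sketch, not the paper's code).

    times  -- follow-up time per patient
    events -- 1/True if the event (death) was observed, 0/False if censored
    risks  -- predicted risk score; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # Order the pair so that `first` has the shorter follow-up time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0  # higher risk for the earlier event: concordant
        elif risks[first] == risks[second]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the reported 0.723 should be read.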
Publisher
Cold Spring Harbor Laboratory