Abstract
Convolutional neural networks (CNNs) are revolutionizing digital pathology by enabling machine learning-based classification of a variety of phenotypes from hematoxylin and eosin (H&E) whole slide images (WSIs), but the interpretation of CNNs remains difficult. Most studies have considered interpretability in a post hoc fashion, e.g., by presenting example regions with strongly predicted class labels. However, such an approach does not explain the biological features that contribute to correct predictions. To address this problem, here we investigate the interpretability of H&E-derived CNN features (the feature weights in the final layer of a transfer-learning-based architecture), which we show can be construed as abstract morphological genes (“mones”) with strong independent associations to biological phenotypes. We observe that many mones are specific to individual cancer types, while others are found in multiple cancers, especially from related tissue types. We also observe that mone-mone correlations are strong and robustly preserved across related cancers. Importantly, linear mone-based classifiers can very accurately separate 38 distinct classes (19 tumor types and their adjacent normals, AUC=97.1% ± 2.8% for each class prediction), and linear classifiers are also highly effective for universal tumor detection (AUC=99.2% ± 0.12%). This linearity provides evidence that individual mones or correlated mone clusters may be associated with interpretable histopathological features or other patient characteristics. In particular, the statistical similarity of mones to gene expression values allows integrative mone analysis via expression-based bioinformatics approaches. We observe strong correlations between individual mones and individual gene expression values, notably mones associated with collagen gene expression in ovarian cancer.
Mone-expression comparisons also indicate that immunoglobulin expression can be identified using mones in colon adenocarcinoma and that immune activity can be identified across multiple cancer types, and we verify these findings by expert histopathological review. Our work demonstrates that mones provide a morphological H&E decomposition that can be effectively associated with diverse phenotypes, analogous to the interpretability of transcription via gene expression values.
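The mone-expression comparison described above can be illustrated with a minimal sketch. It assumes per-sample mone values and matched gene expression values are already available as two matrices with one row per sample; the function name `mone_gene_correlations`, the matrix shapes, and the synthetic data are all hypothetical and are not taken from the study, whose actual pipeline may differ.

```python
import numpy as np


def mone_gene_correlations(mones, expression):
    """Pearson correlation between every mone and every gene.

    mones:      (n_samples, n_mones) array of per-sample CNN feature values
    expression: (n_samples, n_genes) array of matched expression values
    Returns an (n_mones, n_genes) matrix of correlation coefficients.
    Columns with zero variance are not handled and would yield NaN.
    """
    mones = np.asarray(mones, dtype=float)
    expression = np.asarray(expression, dtype=float)
    if mones.shape[0] != expression.shape[0]:
        raise ValueError("mones and expression must cover the same samples")
    # Standardize each column with the population standard deviation,
    # so the averaged product below equals the Pearson coefficient.
    mz = (mones - mones.mean(axis=0)) / mones.std(axis=0)
    gz = (expression - expression.mean(axis=0)) / expression.std(axis=0)
    return mz.T @ gz / mones.shape[0]


# Synthetic illustration only: gene 1 is constructed to track mone 4.
rng = np.random.default_rng(0)
mones = rng.normal(size=(200, 5))
expression = rng.normal(size=(200, 3))
expression[:, 1] = 2.0 * mones[:, 4] + 0.1 * rng.normal(size=200)

corr = mone_gene_correlations(mones, expression)
# Index of the mone most strongly correlated (in magnitude) with gene 1.
best_mone = int(np.argmax(np.abs(corr[:, 1])))
```

Ranking the entries of `corr` by absolute value is one simple way to shortlist mone-gene pairs for follow-up review; it does not by itself establish a biological association.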
Publisher
Cold Spring Harbor Laboratory