Affiliation:
1. Melanoma Institute Australia, The University of Sydney, Sydney, New South Wales, Australia
2. Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
3. Tissue Pathology and Diagnostic Oncology, Royal Prince Alfred Hospital and NSW Health Pathology, Sydney, New South Wales, Australia
4. Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, New South Wales, Australia
5. Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia
Abstract
Background: The broad histomorphological spectrum of melanocytic pathologies requires large data sets to develop accurate and generalisable deep learning (DL)‐based diagnostic pathology classifiers. Weakly supervised DL allows larger training data sets to be used than fully supervised (patch annotation) approaches.

Objectives: To evaluate weakly supervised DL image classifiers for discriminating melanomas from naevi on haematoxylin and eosin (H&E)‐stained pathology slides.

Methods: A representative H&E slide from each of 260 naevi and 260 melanomas from mucocutaneous sites at one tertiary institution was digitised. Clinicopathological features, including thickness and histological subtype, were recorded for each case. Whole‐slide or whole‐tissue section labels were applied. The ground truth was established by consensus diagnosis from two pathologists. Three multiple‐instance learning models (Trans‐MIL, CLAM and DTFD‐MIL) were evaluated at 10×, 20× and 40× magnifications using stratified fivefold Monte Carlo cross‐validation, with 80/10/10 splits for training/validation/test groups, to discriminate melanoma from naevus. Heatmaps were generated to understand model performance.

Results: Patients with naevi were younger than those with melanoma (median age: 51 vs. 71.5 years) and had a more balanced sex distribution (males: 48.8% vs. 64.2%). The most frequent histological subtypes of naevi and melanomas were dysplastic compound (n = 99, 38.1%) and superficial spreading (n = 124, 47.7%), respectively. At 20× magnification, the average AUC (±1 SD) across test groups was 0.9952 ± 0.006 for Trans‐MIL, 0.9925 ± 0.0052 for CLAM and 0.9708 ± 0.0328 for DTFD‐MIL. Performance of the models varied according to the magnification used. Heatmaps from the two best performing models, Trans‐MIL and CLAM, generally indicated attention on appropriate tissue regions for interpretation.

Conclusions: Weakly supervised DL on pathological slides of common mucocutaneous melanocytic tumours discriminates melanomas from naevi with high accuracy. External validation and further assessment of this method on less frequently occurring histological subtypes and borderline cases are required.
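The evaluation scheme named in the Methods (stratified fivefold Monte Carlo cross‐validation with 80/10/10 training/validation/test splits, summarised as a mean AUC ± 1 SD) can be illustrated with a minimal sketch. This is not the authors' code: the function names `stratified_mc_splits` and `summarise`, the fixed seed, the independent reshuffling on each repeat and the use of the sample standard deviation are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def stratified_mc_splits(labels, n_repeats=5, fractions=(0.8, 0.1, 0.1), seed=0):
    """Yield one (train, val, test) triple of index lists per repeat.

    Hypothetical helper: each class is shuffled and cut separately, so
    every split keeps the class balance of the full data set.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    for _ in range(n_repeats):
        train, val, test = [], [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)  # fresh random draw on every repeat
            n_train = round(fractions[0] * len(shuffled))
            n_val = round(fractions[1] * len(shuffled))
            train += shuffled[:n_train]
            val += shuffled[n_train:n_train + n_val]
            test += shuffled[n_train + n_val:]  # remainder is the test group
        yield train, val, test


def summarise(scores):
    """Return (mean, sample SD) of per-repeat scores such as test AUCs."""
    return mean(scores), stdev(scores)


# Cohort sizes taken from the abstract: 260 naevi and 260 melanomas.
labels = ["naevus"] * 260 + ["melanoma"] * 260
splits = list(stratified_mc_splits(labels))
```

With these cohort sizes, each repeat places 208 slides per class in training and 26 per class in each of the validation and test groups. Passing the five test AUCs from a model to `summarise` would give a figure in the same form as those reported in the Results.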
Funder
Royal College of Pathologists of Australasia
University of Sydney
National Health and Medical Research Council
Melanoma Institute Australia