Abstract
Background
Pathology synopses consist of semi-structured or unstructured text that summarizes visual information obtained by observing human tissue. Experts write and interpret these synopses, drawing on extensive domain-specific knowledge to extract tissue semantics and formulate a diagnosis in the context of ancillary testing and clinical information. The limited number of specialists available to interpret pathology synopses restricts the utility of the information they contain. Deep learning offers a tool for information extraction and automatic feature generation from complex datasets.
Methods
Using an active learning approach, we developed a set of semantic labels for bone marrow aspirate pathology synopses. We then trained a transformer-based deep-learning model to map these synopses to one or more semantic labels, and extracted learned embeddings (i.e., meaningful attributes) from the model’s hidden layer.
Results
Here we demonstrate that with a small amount of training data, a transformer-based natural language model can extract embeddings from pathology synopses that capture diagnostically relevant information. On average, these embeddings can be used to generate semantic labels mapping patients to probable diagnostic groups with a micro-average F1 score of 0.779 ± 0.025.
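For readers unfamiliar with the metric, a micro-average F1 score for multi-label predictions pools true positives, false positives and false negatives across all synopses and labels before computing F1. The sketch below is only an illustration of that definition, not the authors' evaluation code; the function name `micro_f1` and the example label names are hypothetical.

```python
from typing import Sequence, Set


def micro_f1(true_labels: Sequence[Set[str]], pred_labels: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 for multi-label data: counts are pooled over all samples."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not in the reference
        fn += len(truth - pred)   # reference labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical semantic labels for three synopses (illustrative only).
truth = [{"normal"}, {"acute leukemia", "hypercellular"}, {"plasma cell neoplasm"}]
pred = [{"normal"}, {"acute leukemia"}, {"plasma cell neoplasm", "hypercellular"}]

score = micro_f1(truth, pred)
print(f"micro-average F1 = {score:.3f}")  # 3 TP, 1 FP, 1 FN -> 0.750
```

Because counts are pooled before averaging, frequent labels contribute more to the micro-average than rare ones.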
Conclusions
We provide a generalizable deep learning model and approach to unlock the semantic information inherent in pathology synopses toward improved diagnostics, biodiscovery and AI-assisted computational pathology.
Publisher
Springer Science and Business Media LLC
Cited by
8 articles.