Abstract
Abstract
Objective. In this study, we propose a semi-supervised learning (SSL) scheme using a patch-based deep learning (DL) framework to tackle the challenge of high-precision classification of seven lung tumor growth patterns, despite having a small amount of labeled data in whole slide images (WSIs). This scheme aims to enhance generalization ability with limited data and reduce dependence on large amounts of labeled data. It effectively addresses the common challenge of high demand for labeled data in medical image analysis. Approach. To address these challenges, the study employs a SSL approach enhanced by a dynamic confidence threshold mechanism. This mechanism adjusts based on the quantity and quality of pseudo labels generated. This dynamic thresholding mechanism helps avoid the imbalance of pseudo-label categories and the low number of pseudo-labels that may result from a higher fixed threshold. Furthermore, the research introduces a multi-teacher knowledge distillation (MTKD) technique. This technique adaptively weights predictions from multiple teacher models to transfer reliable knowledge and safeguard student models from low-quality teacher predictions. Main results. The framework underwent rigorous training and evaluation using a dataset of 150 WSIs, each representing one of the seven growth patterns. The experimental results demonstrate that the framework is highly accurate in classifying lung tumor growth patterns in histopathology images. Notably, the performance of the framework is comparable to that of fully supervised models and human pathologists. In addition, the framework’s evaluation metrics on a publicly available dataset are higher than those of previous studies, indicating good generalizability. Significance. This research demonstrates that a SSL approach can achieve results comparable to fully supervised models and expert pathologists, thus opening new possibilities for efficient and cost-effective medical images analysis. The implementation of dynamic confidence thresholding and MTKD techniques represents a significant advancement in applying DL to complex medical image analysis tasks. This advancement could lead to faster and more accurate diagnoses, ultimately improving patient outcomes and fostering the overall progress of healthcare technology.
Funder
National Key Research and Development Program of China
National Natural Science Foundation of China
Reference50 articles.
1. Accuracy comparison of different batch size for a supervised machine learning task with image classification;Aldin,2022
2. A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications;Alzubaidi;J. Big Data,2023
3. Pseudo-labeling and confirmation bias in deep semi-supervised learning;Arazo,2020
4. Do deep nets really need to be deep?;Ba,2014
5. Updates in grading and invasion assessment in lung adenocarcinoma;Borczuk;Mod. Pathol.,2022