Medical Image Classification with a Hybrid SSM Model Based on CNN and Transformer

Published:2024-08-05 Issue:15 Volume:13 Page:3094
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Hu Can¹, Cao Ning¹, Zhou Han², Guo Bin³

Affiliation:

1. School of Computer and Soft, Hohai University, Nanjing 211100, China

2. School of Electronic Science and Engineering, Nanjing University, Nanjing 210046, China

3. College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China

Abstract

Medical image classification, a pivotal task for diagnostic accuracy, poses unique challenges because medical images are more intricate and variable than natural images. Convolutional Neural Networks (CNNs) and Transformers are both prevalent in this domain, but each architecture has drawbacks: CNNs are strong at local feature extraction yet fall short in capturing global context, whereas Transformers excel at global information but can overlook fine-grained details. Hybrid models that integrate CNNs and Transformers aim to bridge this gap by extracting local and global features simultaneously. However, this approach remains constrained in its capacity to model long-range dependencies, which hinders the efficient extraction of distant features. To address these issues, we introduce the MambaConvT model, which employs a state-space approach. It first processes input features locally through multi-core convolution, enhancing the extraction of deep, discriminative local details. Next, depthwise separable convolution with a 2D selective scanning module (SS2D) is employed to maintain a global receptive field and establish long-distance connections, capturing fine-grained features. The model then combines the hybrid features for comprehensive feature extraction, followed by global feature modeling to emphasize global detail information and optimize the feature representation. We conduct thorough performance experiments comparing different algorithms on four publicly available datasets and two private datasets. The results demonstrate that MambaConvT outperforms the latest classification algorithms in accuracy, precision, recall, F1 score, and AUC, achieving superior performance in the precise classification of medical images.
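The abstract's claim that a 2D scanning module maintains a global receptive field can be illustrated with a small sketch. The code below is not the authors' SS2D implementation. It assumes the commonly described scheme of scanning the image along four traversal orders and summing the results, and it uses fixed scalar parameters `a`, `b`, `c` (hypothetical names) where a selective scan would make them input-dependent. The function names `linear_scan` and `four_way_scan` are likewise illustrative only.

```python
from typing import List

Grid = List[List[float]]


def linear_scan(seq: List[float], a: float, b: float, c: float) -> List[float]:
    """Run the state-space recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t."""
    h = 0.0
    out = []
    for x in seq:
        h = a * h + b * x  # the state carries information from all earlier inputs
        out.append(c * h)
    return out


def four_way_scan(grid: Grid, a: float = 0.5, b: float = 1.0, c: float = 1.0) -> Grid:
    """Scan a 2D grid in four directions and sum the per-pixel outputs.

    Assumption: fixed scalars a, b, c stand in for the learned,
    input-dependent parameters of a real selective scan.
    """
    rows, cols = len(grid), len(grid[0])
    # Two base traversal orders over pixel coordinates
    row_major = [(r, q) for r in range(rows) for q in range(cols)]
    col_major = [(r, q) for q in range(cols) for r in range(rows)]
    # Forward and reversed versions give four scan routes in total
    orders = [row_major, row_major[::-1], col_major, col_major[::-1]]

    merged = [[0.0] * cols for _ in range(rows)]
    for order in orders:
        ys = linear_scan([grid[r][q] for r, q in order], a, b, c)
        # Write each scan's output back to its original pixel position
        for (r, q), y in zip(order, ys):
            merged[r][q] += y
    return merged


if __name__ == "__main__":
    impulse = [[1.0, 0.0], [0.0, 0.0]]
    for row in four_way_scan(impulse):
        print(row)
```

With a single nonzero pixel in one corner, every output position ends up nonzero, because each pixel is downstream of the corner in at least one scan route. That is the sense in which a multi-directional scan gives each position access to the whole image at a cost linear in the number of pixels.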

Funder

Jiangsu Provincial Key Research and Development Program

Cao Ning

Publisher

MDPI AG

Link

https://www.mdpi.com/2079-9292/13/15/3094/pdf
