Cross-Parallel Transformer: Parallel ViT for Medical Image Segmentation-Reference-Cited by-同舟云学术

Cross-Parallel Transformer: Parallel ViT for Medical Image Segmentation

Published:2023-11-29 Issue:23 Volume:23 Page:9488
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Wang Dong¹^ORCID,Wang Zixiang¹,Chen Ling¹,Xiao Hongfeng¹,Yang Bo¹

Affiliation:

1. College of Engineering and Design, Hunan Normal University, Changsha 410081, China

Abstract

Medical image segmentation primarily utilizes a hybrid model consisting of a Convolutional Neural Network and sequential Transformers. The latter leverage multi-head self-attention mechanisms to achieve comprehensive global context modelling. However, despite their success in semantic segmentation, the feature extraction process is inefficient and demands more computational resources, which hinders the network’s robustness. To address this issue, this study presents two innovative methods: PTransUNet (PT model) and C-PTransUNet (C-PT model). The C-PT module refines the Vision Transformer by substituting a sequential design with a parallel one. This boosts the feature extraction capabilities of Multi-Head Self-Attention via self-correlated feature attention and channel feature interaction, while also streamlining the Feed-Forward Network to lower computational demands. On the Synapse public dataset, the PT and C-PT models demonstrate improvements in DSC accuracy by 0.87% and 3.25%, respectively, in comparison with the baseline model. As for the parameter count and FLOPs, the PT model aligns with the baseline model. In contrast, the C-PT model shows a decrease in parameter count by 29% and FLOPs by 21.4% relative to the baseline model. The proposed segmentation models in this study exhibit benefits in both accuracy and efficiency.

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Biochemistry,Instrumentation,Atomic and Molecular Physics, and Optics,Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/23/23/9488/pdf

Reference49 articles.

1. Long, J., Shelhamer, E., and Darrell, T. (2015, January 7–12). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.

2. Segnet: A deep convolutional encoder-decoder architecture for image segmentation;Badrinarayanan;IEEE Trans. Pattern Anal. Mach. Intell.,2017

3. Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5–9). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany.

4. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst., 30.

5. Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., and Zhou, Y. (2021). Transunet: Transformers make strong encoders for medical image segmentation. arXiv.

Cited by 2 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. S-E Pipeline: A Vision Transformer (ViT) based Resilient Classification Pipeline for Medical Imaging Against Adversarial Attacks;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30

2. Correction: Wang et al. Cross-Parallel Transformer: Parallel ViT for Medical Image Segmentation. Sensors 2023, 23, 9488;Sensors;2024-01-17