MCV-UNet: a modified convolution & transformer hybrid encoder-decoder network with multi-scale information fusion for ultrasound image semantic segmentation

Published: 2024-06-24
Volume: 10
Page: e2146
ISSN: 2376-5992
Container-title: PeerJ Computer Science
Language: en

Author:

Xu Zihong¹, Wang Ziyang²

Affiliation:

1. Department of Mechanical Engineering, Columbia University, New York, United States of America

2. Department of Computer Science, University of Oxford, Oxford, United Kingdom

Abstract

Accurate semantic segmentation of ultrasound images has become increasingly important in recent years, prompting numerous advances in deep learning-based techniques. In this article, we introduce a hybrid network that combines convolutional neural networks (CNN) and Vision Transformers (ViT) for ultrasound image semantic segmentation. Our primary contribution is the incorporation of multi-scale CNN in both the encoder and decoder stages, which enhances feature learning across multiple scales. In addition, the bottleneck of the network uses a ViT to capture long-range, high-dimensional spatial dependencies, a factor often overlooked in conventional CNN-based approaches. We conducted extensive experiments on a public benchmark ultrasound nerve segmentation dataset. The proposed method was benchmarked against 17 existing baseline methods and outperformed all of them, including a 4.6% improvement in Dice over TransUNet, a 13.0% improvement in Dice over Attention UNet, and a 10.5% improvement in precision over UNet. These results indicate significant potential for real-world applications in medical imaging and demonstrate the value of blending CNN and ViT in a unified framework.
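The abstract reports results as Dice and precision but does not define them on this page. As a minimal sketch, assuming the standard definitions for binary masks (Dice = 2TP / (2TP + FP + FN), precision = TP / (TP + FP)), the two metrics could be computed as follows. The function names, the list-of-lists mask representation, and the handling of empty masks are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence

# Assumed representation: a 2-D binary mask, 1 = structure, 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, target: Mask) -> tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) over all pixels."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def dice(pred: Mask, target: Mask) -> float:
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def precision(pred: Mask, target: Mask) -> float:
    """Precision: TP / (TP + FP)."""
    tp, fp, _ = confusion_counts(pred, target)
    # Assumed convention: no predicted foreground gives precision 0.
    return 0.0 if tp + fp == 0 else tp / (tp + fp)


if __name__ == "__main__":
    # Toy masks, made up for illustration only.
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(f"Dice: {dice(pred, target):.3f}")
    print(f"Precision: {precision(pred, target):.3f}")
```

Dice penalizes both false positives and false negatives, while precision penalizes only false positives, so the two can diverge when a model over-segments or under-segments.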

Publisher

PeerJ

Link

https://peerj.com/articles/cs-2146.pdf
