Affiliation:
1. Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, Shaanxi, China
2. School of Optoelectronics, University of Chinese Academy of Sciences, Beijing, China
3. Key Laboratory of Space Precision Measurement Technology, Xi'an, Shaanxi, China
Abstract
Medical image segmentation remains particularly challenging for complex and low‐contrast anatomical structures, especially in brain MRI glioma segmentation. Gliomas vary extensively in appearance and location on brain MR images, which makes robust tumour segmentation extremely challenging and leads to high variability even in manual segmentation. U‐Net has become the de facto standard in medical image segmentation tasks, with great success. Previous research has proposed various U‐Net‐based 2D Convolutional Neural Networks (2D‐CNN) and their 3D variants, called 3D‐CNN‐based architectures, for capturing contextual information. However, U‐Net is often limited in explicitly modelling long‐range dependencies because of the inherent locality of convolution operations. Inspired by the recent success of natural language processing transformers in long‐range sequence learning, a multi‐view 2D U‐Net with transformer (TransMVU) method is proposed, which combines the advantages of the transformer and the 2D U‐Net. On the one hand, the transformer takes the tokenized image patches of the CNN feature map as its input sequence and extracts global context for global feature modelling. On the other hand, multi‐view 2D U‐Nets can provide accurate segmentation with fewer parameters than 3D networks. Experimental results on the BraTS20 dataset demonstrate that our model outperforms state‐of‐the‐art 2D models and a classic 3D model.
Publisher
Institution of Engineering and Technology (IET)
Subject
Electrical and Electronic Engineering, Computer Vision and Pattern Recognition, Signal Processing, Software
Cited by
2 articles.