VerFormer: Vertebrae-Aware Transformer for Automatic Spine Segmentation from CT Images

Published: 2024-08-25 Issue: 17 Volume: 14 Page: 1859
ISSN: 2075-4418
Container-title: Diagnostics
language: en
Short-container-title: Diagnostics

Author:

Li Xinchen¹, Hong Yuan¹, Xu Yang¹, Hu Mu¹

Affiliation:

1. Department of Orthopedics, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China

Abstract

Accurate and efficient segmentation of the spine is important in the diagnosis and treatment of spine malfunctions and fractures. However, it is still challenging because of large inter-vertebra variations in shape and cross-image localization of the spine. Previous methods have widely applied convolutional neural networks (CNNs) as the vision backbone for this task. Because of the inherent locality of the convolution operation, these methods struggle to use global contextual information across the whole image for accurate spine segmentation. The Vision Transformer (ViT) has been proposed as an alternative vision backbone with a high capacity to capture global contextual information. When the ViT is employed for spine segmentation, however, it treats all input tokens equally, including vertebrae-related and non-vertebrae-related tokens. It also lacks the capability to locate regions of interest, which lowers the accuracy of spine segmentation. To address these limitations, we propose a novel Vertebrae-aware Vision Transformer (VerFormer) for automatic spine segmentation from CT images. Our VerFormer is designed by incorporating a novel Vertebrae-aware Global (VG) block into the ViT backbone. In the VG block, the vertebrae-related global contextual information is extracted by a Vertebrae-aware Global Query (VGQ) module. This information is then incorporated into the query tokens to highlight vertebrae-related tokens in the multi-head self-attention module. The VG block can thereby leverage global contextual information to locate spines effectively and efficiently across the whole input, improving the segmentation accuracy of VerFormer. Driven by this design, the VerFormer demonstrates a solid capacity to capture more discriminative dependencies and vertebrae-related context in automatic spine segmentation.
The experimental results on two spine CT segmentation tasks demonstrate the effectiveness of our VG block and the superiority of our VerFormer in spine segmentation. Compared with other popular CNN- or ViT-based segmentation models, our VerFormer shows superior segmentation accuracy and generalization.
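
The idea of folding a global query into the query tokens of multi-head self-attention can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the VGQ module computes its output, so the mean pooling, the projection `w_g`, and the additive fusion below are all assumptions, and the function name `global_query_attention` is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_query_attention(tokens, w_q, w_k, w_v, w_g, num_heads):
    """Multi-head self-attention with a global query added to every query token.

    tokens: (n, d) array of token embeddings.
    w_q, w_k, w_v, w_g: (d, d) projection matrices (w_g is hypothetical).
    """
    n, d = tokens.shape
    head_dim = d // num_heads

    # Assumption: the global query is a projected mean of all tokens.
    global_query = tokens.mean(axis=0, keepdims=True) @ w_g  # (1, d)

    # Assumption: fusion is a simple addition, broadcast over all tokens.
    q = tokens @ w_q + global_query  # (n, d)
    k = tokens @ w_k
    v = tokens @ w_v

    def split_heads(x):
        # (n, d) -> (num_heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    qh, kh, vh = split_heads(q), split_heads(k), split_heads(v)

    # Scaled dot-product attention per head.
    scores = qh @ kh.transpose(0, 2, 1) / np.sqrt(head_dim)  # (h, n, n)
    weights = softmax(scores, axis=-1)
    out = weights @ vh  # (h, n, head_dim)

    # Merge heads back to (n, d).
    return out.transpose(1, 0, 2).reshape(n, d), weights
```

In this sketch the added global query contributes the same extra term to every query's score against a given key, so keys that align with the global context receive more attention from all tokens. Whether VerFormer achieves its vertebrae-aware weighting this way is not stated in the abstract.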

Publisher

MDPI AG

Link

https://www.mdpi.com/2075-4418/14/17/1859/pdf
