Prompt-Based Tuning of Transformer Models for Multi-Center Medical Image Segmentation of Head and Neck Cancer

Published:2023-07-24
Volume:10
Issue:7
Page:879
ISSN:2306-5354
Container-title:Bioengineering
language:en
Short-container-title:Bioengineering

Author:

Saeed Numan¹, Ridzuan Muhammad¹, Majzoub Roba Al², Yaqub Mohammad²

Affiliation:

1. Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi 7909, United Arab Emirates

2. Department of Computer Vision, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi 7909, United Arab Emirates

Abstract

Medical image segmentation is a vital healthcare task that requires precise and efficient models for appropriate diagnosis and treatment. Vision transformer (ViT)-based segmentation models have shown strong performance on this task. However, building a powerful backbone requires large-scale pre-training data for the self-attention blocks of ViT. Current methods for adapting pre-trained models update all or some of the backbone parameters. This paper proposes a novel fine-tuning strategy for adapting a pre-trained transformer-based segmentation model to data from a new medical center. The method introduces a small number of learnable parameters, termed prompts, into the input space (less than 1% of model parameters) while keeping the rest of the model parameters frozen. Extensive studies on data from new, unseen medical centers show that prompt-based fine-tuning of medical segmentation models provides excellent performance on the new-center data with a negligible drop on the old centers. Additionally, the strategy delivers high accuracy with minimal re-training on new-center data, significantly decreasing the computational and time costs of fine-tuning pre-trained models. Our source code will be made publicly available.
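
The core idea of the abstract, a few learnable prompt tokens added to the input while the backbone stays frozen, can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The names `ParamBlock`, `prepend_prompts`, and `trainable_fraction` are hypothetical, as are all sizes, and prepending the prompts to the token sequence is an assumed placement, since the abstract says only that they enter the input space.

```python
from dataclasses import dataclass
from typing import List

Token = List[float]


@dataclass
class ParamBlock:
    """A named group of weights; `trainable` says whether fine-tuning may update it."""
    name: str
    size: int
    trainable: bool


def prepend_prompts(prompts: List[Token], tokens: List[Token]) -> List[Token]:
    """Place the learnable prompt tokens in front of the input tokens."""
    return [list(p) for p in prompts] + [list(t) for t in tokens]


def trainable_fraction(blocks: List[ParamBlock]) -> float:
    """Share of all parameters that fine-tuning is allowed to update."""
    total = sum(b.size for b in blocks)
    trainable = sum(b.size for b in blocks if b.trainable)
    return trainable / total


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
EMBED_DIM = 96
NUM_PROMPTS = 50
NUM_INPUT_TOKENS = 512

blocks = [
    ParamBlock("encoder", 60_000_000, trainable=False),  # frozen backbone
    ParamBlock("decoder", 30_000_000, trainable=False),  # frozen backbone
    ParamBlock("prompts", NUM_PROMPTS * EMBED_DIM, trainable=True),  # only these are tuned
]

# Placeholder values stand in for learned prompts and embedded image patches.
prompts = [[0.0] * EMBED_DIM for _ in range(NUM_PROMPTS)]
tokens = [[1.0] * EMBED_DIM for _ in range(NUM_INPUT_TOKENS)]
sequence = prepend_prompts(prompts, tokens)

fraction = trainable_fraction(blocks)
print(f"sequence length: {len(sequence)}")
print(f"trainable share: {fraction:.6%}")
```

With these assumed sizes the trainable share falls far below the 1% bound quoted in the abstract. In a real framework, the frozen blocks would correspond to backbone weights excluded from gradient updates.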

Funder

MBZUAI

Publisher

MDPI AG

Subject

Bioengineering

Link

https://www.mdpi.com/2306-5354/10/7/879/pdf

References (25 articles)

1. Alalwan (2021). Efficient 3D Deep Learning Model for Medical Image Semantic Segmentation. Alex. Eng. J.

2. de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., and Essert, C. (2021). Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Strasbourg, France, 27 September–1 October 2021, Springer International Publishing.

3. Zhou, H., Guo, J., Zhang, Y., Yu, L., Wang, L., and Yu, Y. (2021). nnFormer: Interleaved Transformer for Volumetric Segmentation. arXiv.

4. Vaswani (2017). Attention is All You Need. Adv. Neural Inf. Process. Syst.

5. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv.

Cited by 2 articles.

1. Vision transformer promotes cancer diagnosis: A comprehensive review. Expert Systems with Applications, 2024-10.

2. Understanding the brain with attention: A survey of transformers in brain sciences. Brain‐X, 2023-09.