1. Single-layer vision transformers for more accurate early exits with less overhead;Bakhtiarnia;Neural Networks,2022
2. Swin-Unet: Unet-like pure transformer for medical image segmentation;Cao,2021
3. TransUNet: Transformers make strong encoders for medical image segmentation;Chen,2021
4. ConViT: Improving vision transformers with soft convolutional inductive biases;D’Ascoli,2021
5. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., et al. (2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations.