1. Vatt: Transformers for multimodal self-supervised learning from raw video, audio and text;Akbari,2021
2. Arnab, A., Dehghani, M., Heigold, G., Sun, C., Lučić, M., Schmid, C., 2021. Vivit: A video vision transformer. In: International Conference on Computer Vision (ICCV). pp. 6836–6846.
3. The RSNA-ASNR-MICCAI braTS 2021 benchmark on brain tumor segmentation and radiogenomic classification;Baid,2021
4. Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features;Bakas;Sci. Data,2017
5. Bao, H., Dong, L., Wei, F., 2022. BEiT: BERT Pre-Training of Image Transformers. In: International Conference on Learning Representations (ICLR).