1. Attention is all you need;Vaswani,2017
2. Vision transformer models for mobile/edge devices: A survey;Lee;Multimedia Syst.,2024
3. Discrete cosine transformed images are easy to recognize in vision transformers;Lee;IEIE Trans. Smart Process. Comput.,2023
4. Robust speech recognition via large-scale weak supervision;Radford,2023
5. Conformer: Convolution-augmented transformer for speech recognition;Gulati,2020