1. Seyedarmin Azizi, Mahdi Nazemi, and Massoud Pedram. 2024. Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy. arXiv:2402.06004 [cs.CV]
2. Yueyin Bai et al. 2023. LTrans-OPU: A Low-Latency FPGA-Based Overlay Processor for Transformer Networks. In 33rd International Conference on Field-Programmable Logic and Applications, FPL 2023. IEEE, 283--287.
3. Jia Deng et al. 2009. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
4. Alexey Dosovitskiy et al. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In 9th International Conference on Learning Representations.
5. Nazim Altar Koca et al. 2023. Hardware-efficient Softmax Approximation for Self-Attention Networks. In IEEE International Symposium on Circuits and Systems, ISCAS 2023.