Affiliation:
1. School of Computer Science and Technology, University of Science and Technology of China, China
2. School of Software Engineering, University of Science and Technology of China, China
Funder:
Jiangsu Provincial Natural Science Foundation
Youth Innovation Promotion Association of the Chinese Academy of Sciences
National Key Research and Development Program of China
National Natural Science Foundation of China