1. Liu Z., Lin Y., Cao Y., Hu H., Wei Y., Zhang Z., Lin S. and Guo B. Swin transformer: Hierarchical vision transformer using shifted windows, International Conference on Computer Vision (ICCV), 2021.
2. Han K., Xiao A., Wu E., Guo J., Xu C. and Wang Y. Transformer in transformer, arXiv:2103.00112, 2021.
3. Chen et al. Developing real-time streaming transformer transducer for speech recognition on large-scale dataset, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.
4. Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser L. and Polosukhin I. Attention is all you need. In Guyon I., von Luxburg U., Bengio S., Wallach H.M., Fergus R., Vishwanathan S.V.N. and Garnett R., editors, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5998–6008.
5. Devlin et al. BERT: Pre-training of deep bidirectional transformers for language understanding, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019.