1. Attention is all you need;vaswani;Proc NIPS,2017
2. Bert: Pre-training of deep bidirectional transformers for language understanding;devlin,2018
3. Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition
4. Towards data selection on tts data for children’s speech recognition;wang;Proc IEEE ICASSP,2021
5. An image is worth 16x16 words: Transformers for image recognition at scale;dosovitskiy,2020