1. LRS3-TED: a large-scale dataset for visual speech recognition;Afouras,2018
2. DRILL: Dynamic representations for imbalanced lifelong learning;Ahrens,2021
3. Improved recognition of contact names in voice commands;Aleksic,2015
4. Deep Speech 2: End-to-end speech recognition in English and Mandarin;Amodei,2016
5. Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In International conference on learning representations.