1. Deep Audio-visual Speech Recognition
2. Triantafyllos Afouras, Joon Son Chung, and Andrew Zisserman. 2018. LRS3-TED: a large-scale dataset for visual speech recognition. CoRR abs/1809.00496 (2018). arXiv:1809.00496 http://arxiv.org/abs/1809.00496
3. Bringing portraits to life
4. Alexei Baevski et al. 2020. wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in Neural Information Processing Systems (2020).