1. Triantafyllos Afouras, Joon Son Chung, Andrew Senior, Oriol Vinyals, and Andrew Zisserman. 2018. Deep Audio-visual Speech Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (2018), 1--1.
2. Performer vs. observer
3. Crowdsourcing vs Laboratory-Style Social Acceptability Studies?
4. Yannis M Assael, Brendan Shillingford, Shimon Whiteson, and Nando De Freitas. 2016. LipNet: End-to-end sentence-level lipreading. arXiv preprint arXiv:1611.01599 (2016).
5. Alexei Baevski et al. 2021. Unsupervised speech recognition. Advances in Neural Information Processing Systems (2021).