1. The sound of pixels;Zhao;Proceedings of the European Conference on Computer Vision (ECCV),2018
2. Audio-visual scene analysis with self-supervised multisensory features;Owens;Proceedings of the European Conference on Computer Vision (ECCV),2018
3. Neural discrete representation learning;van den Oord;arXiv preprint,2017
4. Neural Speech Synthesis with Transformer Network
5. Lipper: Synthesizing Thy Speech Using Multi-View Lipreading