1. Ambient sound provides supervision for visual learning;owens;European Conference on Computer Vision,2016
2. SoundNet: Learning sound representations from unlabeled video;aytar;Advances in neural information processing systems,2016
3. Least Squares Generative Adversarial Networks
4. AENet: Learning Deep Audio Features for Video Analysis
5. HiFi-GAN: Generative adversarial networks for efficient and high fidelity speech synthesis;kong;Advances in neural information processing systems,2020