1. Detection of glottal closure instants from speech signals: A quantitative review;drugman;IEEE Transactions on Audio, Speech, and Language Processing,2011
2. Joint robust voicing detection and pitch estimation based on residual harmonics;drugman;Interspeech,2019
3. Audio-Visual Instance Discrimination with Cross-Modal Agreement
4. Scaling up visual and vision-language representation learning with noisy text supervision;jia;arXiv preprint,2021
5. Learning transferable visual models from natural language supervision;radford;arXiv preprint,2021