Author:
Khorrami Khazar, Räsänen Okko
Cited by
4 articles.
1. Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding;2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU);2023-12-16
2. SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model;2022 IEEE Spoken Language Technology Workshop (SLT);2023-01-09
3. Unsupervised Audio-Caption Aligning Learns Correspondences Between Individual Sound Events and Textual Phrases;ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2022-05-23
4. Fast-Slow Transformer for Visually Grounding Speech;ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2022-05-23