1. Cascaded cross-modal transformer for audio–textual classification;Artificial Intelligence Review;2024-08-02
2. Cross-Modal Alignment for End-to-End Spoken Language Understanding Based on Momentum Contrastive Learning;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14
3. Written Term Detection Improves Spoken Term Detection;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024
4. End-to-End Speech Recognition: A Survey;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024
5. Exploring the Viability of Synthetic Audio Data for Audio-Based Dialogue State Tracking;2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU);2023-12-16