1. On Metric Learning for Audio-Text Cross-Modal Retrieval
2. Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification
3. Take it easy: Relaxing contrastive ranking loss with CIDEr. de Gail. DCASE2022 Challenge, Tech. Rep., 2022.
4. A simple framework for contrastive learning of visual representations. Chen. International Conference on Machine Learning, 2020.
5. AudioCaps: Generating captions for audios in the wild. Kim. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019.