1. Jiang Q Y, Li W J. Deep cross-modal hashing[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 3232–3240.
2. Li C, Deng C, Li N, et al. Self-supervised adversarial hashing networks for cross-modal retrieval[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 4242–4251.
3. Wu L, et al. Cycle-consistent deep generative hashing for cross-modal retrieval[J]. IEEE Transactions on Image Processing, 2018.
4. Yu Y, et al. Deep cross-modal correlation learning for audio and lyrics in music retrieval[J]. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2019.
5. Lou S, Xu X, Wu M, et al. Audio-Text Retrieval in Context[C]//ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022: 4793–4797.