Authors:
Paraschakis Dimitris, Ros Rasmus, Borg Markus, Runeson Per
Publisher:
Springer Nature Switzerland
References (9 articles):
1. Crouch, C.J., Crouch, D.B., Nareddy, K.R.: The automatic generation of extended queries. In: Proceedings of the 13th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 369–383 (1989)
2. Hayes, T., et al.: MUGEN: a playground for video-audio-text multimodal understanding and GENeration. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision – ECCV 2022. ECCV 2022. LNCS, vol. 13668, pp. 431–449. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20074-8_25
3. Radford, A., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning (2021)
4. Li, J., Li, D., Xiong, C., Hoi, S.: BLIP: bootstrapping language-image pre-training for unified vision-language understanding and generation. In: International Conference on Machine Learning (2022)
5. Pawłowski, M., Wróblewska, A., Sysko-Romańczuk, S.: Effective techniques for multimodal data fusion: a comparative analysis. Sensors 23(5), 2381 (2023)