1. Awad, G., et al.: TRECVID 2017: evaluating ad-hoc and instance video search, events detection, video captioning and hyperlinking. In: Proceedings of TRECVID 2017. NIST, USA (2017)
2. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
3. Inoue, N., Shinoda, K.: Semantic indexing for large-scale video retrieval. ITE Trans. Media Technol. Appl. 4(3), 209–217 (2016). https://doi.org/10.3169/mta.4.209
4. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS 2013, pp. 3111–3119. Curran Associates Inc., USA (2013)
5. Miller, G.A., Fellbaum, C.: WordNet then and now. Lang. Resour. Eval. 41(2), 209–214 (2007). https://doi.org/10.1007/s10579-007-9044-6