1. G. Awad, A. Butt, J. Fiscus, D. Joy, A. Delgado, M. Michel, A. F. Smeaton, Y. Graham, W. Kraaij, G. Quénot, M. Eskevich, R. Ordelman, G. J. F. Jones, and B. Huet, “TRECVID 2017: Evaluating ad-hoc and instance video search, events detection, video captioning and hyperlinking,” In Proc. of TRECVID 2017, (2017).
2. G. Awad, C. G. M. Snoek, A. F. Smeaton, and G. Quénot, “TRECVid Semantic Indexing of Video: A 6-Year Retrospective,” ITE Trans. on MTA 4, 3, (2016) 187.
3. C. G. M. Snoek, S. Cappallo, D. Fontijne, D. Julian, D. C. Koelma, P. Mettes, K. E. A. van de Sande, A. Sarah, H. Stokman, and R. B. Towal, “Qualcomm Research and University of Amsterdam at TRECVID 2015: Recognizing concepts, objects, and events in video,” In Proc. of TRECVID 2015, (2015).
4. K. Ueki and T. Kobayashi, “Waseda at TRECVID 2015: Semantic indexing,” In Proc. of TRECVID 2015, (2015).
5. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A Large-Scale Hierarchical Image Database,” In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2009).