1) G. Awad, K. Curtis, A. A. Butt, J. Fiscus, A. Godil, Y. Lee, A. Delgado, J. Zhang, E. Godard, B. Chocot, L. Diduch, J. Liu, Y. Graham, G. Quénot, "An overview on the evaluated video retrieval tasks at TRECVID 2022," In Proc. of TRECVID 2022, (2022).
2) A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, M. Ranzato, T. Mikolov, "DeViSE: A Deep Visual-Semantic Embedding Model," In Proc. of Advances in Neural Information Processing Systems (NIPS), 26, (2013).
3) R. Kiros, R. Salakhutdinov, R. S. Zemel, "Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models," In Proc. of NIPS 2014 Deep Learning Workshop, (2014).
4) O. Vinyals, A. Toshev, S. Bengio, D. Erhan, "Show and Tell: A Neural Image Caption Generator," In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2015).
5) A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, I. Sutskever, "Learning Transferable Visual Models From Natural Language Supervision," arXiv:2103.00020, (2021).