Affiliation:
1. Wojskowa Akademia Techniczna Wydział Cybernetyki
Abstract
This paper describes an image caption generation system based on deep neural networks. The model is trained to maximize the probability of the generated sentence given the image. It employs transfer learning in the form of pretrained convolutional neural networks to preprocess the image data. Each dataset item consists of a still photograph and five associated English captions. The constructed model is compared with other similarly constructed models using the BLEU score, and ways to further improve its performance are proposed.
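The BLEU comparison mentioned in the abstract can be sketched in plain Python. The following is a minimal sentence-level BLEU (modified n-gram precision combined by a geometric mean and scaled by a brevity penalty), not the exact evaluation code used in the paper; the function names are illustrative.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU: geometric mean of modified n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        # Clip each candidate n-gram count by its maximum count
        # in any single reference caption.
        max_ref = Counter()
        for ref in references:
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = max(1, sum(cand_counts.values()))
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0  # any zero precision zeroes the geometric mean
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty against the reference length closest to the candidate.
    ref_len = min((len(r) for r in references),
                  key=lambda l: (abs(l - len(candidate)), l))
    bp = 1.0 if len(candidate) > ref_len else \
        math.exp(1 - ref_len / max(1, len(candidate)))
    return bp * math.exp(log_avg)
```

A candidate identical to a reference scores 1.0; captions shorter than four tokens score 0.0 under plain (unsmoothed) BLEU-4, which is why smoothed variants are common in captioning work.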
References (13 articles):
1. Farhadi A., et al., “Every picture tells a story: Generating sentences from images”, Computer Vision – ECCV 2010, LNCS 6314, pp. 15–29, Springer, 2010.
2. Mitchell M., et al., “Midge: Generating image descriptions from computer vision detections”, in: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 747–756, April 2012.
3. Bai S., An S., “A survey on automatic image caption generation”, Neurocomputing, Vol. 311, pp. 291–304 (2018).
4. Mikolov T., et al., “Efficient estimation of word representations in vector space”, arXiv preprint arXiv:1301.3781, September 2013.
5. Tanti M., et al., “Where to put the image in an image caption generator”, Natural Language Engineering, Vol. 24(3), pp. 467–489 (2018).