Author:
Kapuriya Monali, Lakkad Zemi, Shah Satwi
Abstract
In this study, we explore the integration of Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks for image caption generation, a task that fuses natural language processing and computer vision techniques to describe images in English. Delving into the field of image captioning, we examine several fundamental concepts and methodologies associated with this area. Our approach leverages widely used tools such as the Keras library, NumPy, and Jupyter notebooks to support the development of our work. We further describe the use of the flickr_dataset and CNNs for image classification, explaining their role in our experiments. Through this research, we aim to contribute to the advancement of image captioning systems by combining state-of-the-art techniques from both the computer vision and natural language processing domains.
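To make the CNN-plus-LSTM pairing described above concrete, the following is a minimal sketch of the common encoder-decoder caption model built with Keras: pre-extracted CNN image features and a partially generated caption are combined to predict the next word. This is an illustrative sketch, not the authors' exact architecture; the vocabulary size, maximum caption length, feature dimension, and layer widths below are assumptions.

# Minimal sketch of a CNN-feature + LSTM caption model in Keras.
# vocab_size, max_len, and feat_dim are assumed values, not taken from the paper.
import numpy as np
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

vocab_size = 8000   # assumed vocabulary size after tokenizing the Flickr captions
max_len = 34        # assumed maximum caption length in tokens
feat_dim = 2048     # assumed CNN feature size (e.g. a pooled InceptionV3/ResNet output)

# Image branch: pre-extracted CNN features projected into the decoder space.
img_in = Input(shape=(feat_dim,))
img_vec = Dense(256, activation='relu')(Dropout(0.5)(img_in))

# Text branch: the partial caption encoded by an LSTM.
txt_in = Input(shape=(max_len,))
txt_emb = Embedding(vocab_size, 256, mask_zero=True)(txt_in)
txt_vec = LSTM(256)(Dropout(0.5)(txt_emb))

# Merge the two modalities and predict the next word of the caption.
merged = add([img_vec, txt_vec])
out = Dense(vocab_size, activation='softmax')(Dense(256, activation='relu')(merged))

model = Model(inputs=[img_in, txt_in], outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()

At inference time such a model is typically run word by word: the caption is seeded with a start token, the predicted word is appended, and the loop repeats until an end token or max_len is reached.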
Publisher
International Journal of Innovative Science and Research Technology
Cited by
1 article.
1. Investigating Data Protection Compliance Challenges; International Journal of Innovative Science and Research Technology (IJISRT); 2024-09-11