Incorporating external knowledge for image captioning using CNN and LSTM

Published:2020-07-16 Issue:28 Volume:34 Page:2050315
ISSN:0217-9849
Container-title:Modern Physics Letters B
language:en
Short-container-title:Mod. Phys. Lett. B

Author:

Sharma Himanshu¹, Jalal Anand Singh¹

Affiliation:

1. Department of Computer Engineering and Applications, GLA University Mathura, Uttar Pradesh 281406, India

Abstract

Image captioning is a multidisciplinary artificial intelligence (AI) research task that has captured the interest of both image processing and natural language processing experts. It is a complex problem because it sometimes requires information that is not directly visible in a given scene, such as common-sense interpretation or detailed knowledge about the objects present in the image. In this paper, we present a method that uses both visual information and external knowledge from knowledge bases such as ConceptNet to describe images better. We demonstrate the usefulness of the method on two publicly available datasets, Flickr8k and Flickr30k. The results show that the proposed model outperforms state-of-the-art approaches for generating image captions. Finally, we discuss possible future prospects in image captioning.

Publisher

World Scientific Pub Co Pte Lt

Subject

Condensed Matter Physics, Statistical and Nonlinear Physics

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0217984920503157

References: 45 articles.

1. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation

2. Going deeper with convolutions

3. ConceptNet — A Practical Commonsense Reasoning Tool-Kit

Cited by 38 articles.

1. Advanced Generative Deep Learning Techniques for Accurate Captioning of Images;Wireless Personal Communications;2024-04-29

2. Enhancing scene‐text visual question answering with relational reasoning, attention and dynamic vocabulary integration;Computational Intelligence;2024-02

3. A Novel Image Captioning Approach Using CNN and MLP;Lecture Notes in Networks and Systems;2024

4. Research and Application of Image Captioning Based on the CNN-LSTM Method;2023 IEEE International Conference on Electrical, Automation and Computer Engineering (ICEACE);2023-12-29

5. Enhancing visual question answering with a two‐way co‐attention mechanism and integrated multimodal features;Computational Intelligence;2023-12-21