Research on the Image Description Algorithm of Double-Layer LSTM Based on Adaptive Attention Mechanism

Published:2022-05-21 Issue: Volume:2022 Page:1-9
ISSN:1563-5147
Container-title:Mathematical Problems in Engineering
language:en
Short-container-title:Mathematical Problems in Engineering

Author:

Qin Cifeng¹, Gong Wenyin¹, Li Xiang¹

Affiliation:

1. School of Computer Science, China University of Geosciences, Wuhan 430078, China

Abstract

Image text description is a multimodal data processing problem that draws on both computer vision and natural language processing. Current research on the task focuses mainly on deep learning methods. This paper addresses the imprecise description of visual words and nonvisual words in image text description. It proposes an adaptive attention double-layer LSTM (long short-term memory) model built on the encoding-decoding (encoder-decoder) framework. Compared with the adaptive attention algorithm based on the same framework, the proposed model improves BLEU-1 by 1.21%, METEOR by 0.75%, and CIDEr by 0.55%. Its BLEU-4 and ROUGE-L scores are lower than those of the original model, though the differences are small. Although the model does not surpass the original on every performance indicator, it describes visual words and nonvisual words more accurately in actual image text description.

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2022/2315341.pdf
