Abstract
Image captioning is an active research topic at the intersection of computer vision and natural language processing (NLP). Recent advances in deep learning (DL) models have improved the overall performance of image captioning approaches. This study develops a metaheuristic optimization with deep learning-enabled automated image captioning technique (MODLE-AICT). The proposed MODLE-AICT model generates captions for input images in two stages: an encoding unit and a decoding unit. In the encoding stage, the salp swarm algorithm (SSA) is used together with a HybridNet model to produce an effective fixed-length vector representation of the input image, which constitutes the novelty of the work. In the decoding stage, a bidirectional gated recurrent unit (BiGRU) model generates descriptive sentences. The SSA-based hyperparameter optimizer helps the model attain effective performance. To assess the MODLE-AICT model, a series of simulations was carried out, and the results were examined from several aspects. The experimental results suggest that the MODLE-AICT model outperforms recent approaches.
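The abstract does not say how the SSA is configured or which hyperparameters it tunes, so the following is only a minimal sketch of the standard salp swarm update rules applied to a toy objective. The function name `salp_swarm_minimize`, its parameters, and the `toy_loss` stand-in for a validation loss are all hypothetical and are not taken from the paper.

```python
import math
import random


def salp_swarm_minimize(objective, lower, upper, n_salps=20, n_iters=100, seed=0):
    """Minimize `objective` over a box using the basic salp swarm algorithm.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    rng = random.Random(seed)
    dim = len(lower)
    # Random initial population inside the search bounds.
    salps = [[rng.uniform(lower[j], upper[j]) for j in range(dim)]
             for _ in range(n_salps)]
    fitness = [objective(s) for s in salps]
    best = min(range(n_salps), key=fitness.__getitem__)
    food, food_fit = list(salps[best]), fitness[best]  # best solution so far

    for it in range(1, n_iters + 1):
        # c1 decays over time, shifting from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iters) ** 2))
        for i in range(n_salps):
            for j in range(dim):
                if i == 0:
                    # Leader: move around the food source.
                    step = c1 * ((upper[j] - lower[j]) * rng.random() + lower[j])
                    # The 0.5 threshold follows common implementations.
                    if rng.random() >= 0.5:
                        salps[i][j] = food[j] + step
                    else:
                        salps[i][j] = food[j] - step
                else:
                    # Follower: move halfway toward the preceding salp.
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
                # Keep every coordinate inside the bounds.
                salps[i][j] = min(max(salps[i][j], lower[j]), upper[j])
            fit = objective(salps[i])
            if fit < food_fit:
                food, food_fit = list(salps[i]), fit
    return food, food_fit


def toy_loss(params):
    """Assumed stand-in for a validation loss; minimum at (-3.0, 0.5)."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.5) ** 2


if __name__ == "__main__":
    best_params, best_loss = salp_swarm_minimize(
        toy_loss, lower=[-6.0, 0.0], upper=[-1.0, 0.9])
    print(best_params, best_loss)
```

In a real hyperparameter search, each call to the objective would train and validate a model, so the population size and iteration count would be far smaller than for this toy function.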
Funder
King Khalid University
Princess Nourah bint Abdulrahman University
Umm al-Qura University
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
9 articles.