Abstract
Image captioning is the automatic generation of text that describes the content of an image. In this work we review the literature, design frameworks, and build application models. We organize the review into four categories: input model, process model, output model, and lingual image caption. The input model is reviewed in terms of caption criteria, method, and dataset. The process model is reviewed in terms of learning type, encoder-decoder, image extractor, and evaluation metric. The output model is reviewed in terms of architecture, feature extraction, feature mapping, model, and number of captions. Lingual image captioning is reviewed in terms of the language model and falls into two groups: bilingual image captioning and cross-language image captioning. We also design three framework models and build three application models. Finally, we offer opinions on research trends and on future research that can be developed with image caption generation. Image captioning can be further developed in the area of computer vision versus human vision.
Publisher
Fakultas Ilmu Komputer Universitas Brawijaya
Cited by
3 articles.