Arabic Captioning for Images of Clothing Using Deep Learning
Authors:
Al-Malki Rasha Saleh 1, Al-Aama Arwa Yousuf 1
Affiliation:
1. Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Abstract
Fashion is one of the many application domains of image captioning. For e-commerce websites hosting tens of thousands of clothing images, automatically generated item descriptions are highly desirable. This paper addresses captioning images of clothing in Arabic using deep learning. Because image captioning requires both visual and textual understanding, such systems build on computer vision and natural language processing techniques. Among the many approaches proposed, the most widely used are deep learning methods, in which an image model analyzes the visual content and a language model generates the caption. Caption generation in English has received considerable attention from researchers, but a gap remains for Arabic, largely because public Arabic datasets are scarce. In this work, we created an Arabic dataset for captioning images of clothing, which we named “ArabicFashionData”, as ours is the first model for captioning images of clothing in Arabic. Moreover, we classified the attributes of the clothing images and fed them as additional inputs to the decoder of our captioning model to improve Arabic caption quality, and we employed an attention mechanism. Our approach achieved a BLEU-1 score of 88.52. These findings are encouraging and suggest that, with a larger dataset, the attribute-based image captioning model can achieve excellent results for Arabic image captioning.
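The abstract describes an encoder-decoder pipeline in which a CNN image model, predicted clothing attributes, and an attention mechanism jointly drive the caption decoder. The sketch below is a minimal, hypothetical PyTorch illustration of that kind of attribute-conditioned attention decoder; it is not the authors' implementation, and all class names, dimensions, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of an attribute-conditioned,
# attention-based caption decoder. A CNN encoder is assumed to have already
# produced a grid of image feature vectors; names and sizes are illustrative.
import torch
import torch.nn as nn


class AttributeAttentionDecoder(nn.Module):
    """LSTM decoder with additive attention over CNN features.

    Predicted clothing attributes (e.g. garment type, colour) are embedded and
    concatenated with the word embedding at every step, mirroring the idea of
    feeding classified attributes into the decoder.
    """

    def __init__(self, vocab_size, n_attrs, feat_dim=512, emb_dim=256, hid_dim=512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.attr_emb = nn.Linear(n_attrs, emb_dim)    # multi-hot attributes -> vector
        self.att_feat = nn.Linear(feat_dim, hid_dim)   # project image regions
        self.att_hid = nn.Linear(hid_dim, hid_dim)     # project decoder state
        self.att_score = nn.Linear(hid_dim, 1)         # additive attention score
        self.lstm = nn.LSTMCell(2 * emb_dim + feat_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, feats, attrs, captions):
        # feats:    (B, L, feat_dim)  flattened CNN feature map
        # attrs:    (B, n_attrs)      multi-hot attribute vector
        # captions: (B, T)            token ids of the Arabic caption (teacher forcing)
        B, T = captions.shape
        h = feats.new_zeros(B, self.lstm.hidden_size)
        c = feats.new_zeros(B, self.lstm.hidden_size)
        a = self.attr_emb(attrs)                                  # (B, emb_dim)
        logits = []
        for t in range(T):
            w = self.word_emb(captions[:, t])                     # (B, emb_dim)
            # additive (Bahdanau-style) attention over image regions
            scores = self.att_score(torch.tanh(
                self.att_feat(feats) + self.att_hid(h).unsqueeze(1))).squeeze(-1)
            alpha = torch.softmax(scores, dim=1)                  # (B, L)
            ctx = (alpha.unsqueeze(-1) * feats).sum(dim=1)        # (B, feat_dim)
            h, c = self.lstm(torch.cat([w, a, ctx], dim=1), (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                         # (B, T, vocab_size)


if __name__ == "__main__":
    dec = AttributeAttentionDecoder(vocab_size=1000, n_attrs=20)
    feats = torch.randn(2, 49, 512)                # e.g. a 7x7 feature map, flattened
    attrs = torch.randint(0, 2, (2, 20)).float()   # predicted clothing attributes
    caps = torch.randint(0, 1000, (2, 12))         # dummy Arabic caption token ids
    print(dec(feats, attrs, caps).shape)           # torch.Size([2, 12, 1000])
```

The reported BLEU-1 corresponds to unigram BLEU; with NLTK it can be computed as corpus_bleu(references, hypotheses, weights=(1.0, 0, 0, 0)) over tokenised Arabic captions.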
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
4 articles.