Neutrosophic-CNN-based image and text fusion for multimodal classification

Author:

Wajid Mohd Anas (1), Zafar Aasim (2), Terashima-Marín Hugo (3), Wajid Mohammad Saif (3)

Affiliation:

1. Department of Computer Science and Application, School of Engineering and Technology, Sharda University, Greater Noida, India

2. Department of Computer Science, Aligarh Muslim University, Civil Lines, Aligarh, Uttar Pradesh, India

3. School of Engineering and Sciences, Tecnológico de Monterrey, Monterrey, Mexico

Abstract

Recent advances in technology and devices have caused an explosion of data on the Internet and on personal computers. This data comes in various modalities (text, image, video, etc.) and is essential for e-commerce websites, where products have both images and textual descriptions and are therefore multimodal in nature. Earlier categorization and information retrieval methods focused mostly on a single modality. This study classifies multimodal data for information retrieval tasks, using neutrosophic fuzzy sets to manage uncertainty. It utilizes image and text data and, inspired by past techniques of embedding text over an image, attempts to classify the resulting images with neutrosophic classification algorithms. Neutrosophic Convolutional Neural Networks (NCNNs) are used to learn feature representations of the produced images, and we demonstrate how an NCNN-based pipeline can learn representations of this fusion method. Traditional convolutional neural networks are vulnerable to unknown noisy conditions in the test phase, so their performance on noisy data declines. On two large-scale multimodal categorization datasets, our method yielded good results compared with the individual sources. We also compared our method with two well-known multimodal fusion methodologies, namely early fusion and late fusion.
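
The abstract names early fusion and late fusion as comparison baselines without describing how they were implemented. As a rough illustration only, the sketch below shows the generic idea behind each: early fusion joins per-modality feature vectors before a single classifier sees them, while late fusion combines the class probabilities produced by separate per-modality classifiers. The function names, the weighting parameter, and the toy numbers are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def early_fusion(image_features: Sequence[float],
                 text_features: Sequence[float]) -> List[float]:
    """Early fusion: concatenate the two feature vectors into one input.

    A single downstream classifier (not shown) would be trained on the result.
    """
    return list(image_features) + list(text_features)


def late_fusion(image_probs: Sequence[float],
                text_probs: Sequence[float],
                image_weight: float = 0.5) -> List[float]:
    """Late fusion: weighted average of per-modality class probabilities.

    `image_weight` is a hypothetical knob for this sketch; the text
    classifier receives the remaining weight (1 - image_weight).
    """
    if len(image_probs) != len(text_probs):
        raise ValueError("both classifiers must score the same classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    text_weight = 1.0 - image_weight
    return [image_weight * p + text_weight * q
            for p, q in zip(image_probs, text_probs)]


def predict(probs: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda i: probs[i])


if __name__ == "__main__":
    # Toy values, invented for illustration.
    image_probs = [0.6, 0.3, 0.1]
    text_probs = [0.2, 0.7, 0.1]
    fused = late_fusion(image_probs, text_probs)
    print("late-fusion scores:", fused)
    print("late-fusion class:", predict(fused))
    print("early-fusion vector:", early_fusion([0.1, 0.9], [0.4, 0.5, 0.6]))
```

In this toy case the image classifier alone would pick class 0 and the text classifier class 1; averaging lets the more confident modality decide. The method described in the abstract differs from both baselines in that text is embedded over the image before an NCNN learns from the combined picture.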

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

Cited by 3 articles.