A Short Survey on Deep Learning for Multimodal Integration: Applications, Future Perspectives and Challenges

Published:2022-11-18 Issue:11 Volume:11 Page:163
ISSN:2073-431X
Container-title:Computers
language:en
Short-container-title:Computers

Author:

Dimitri Giovanna Maria

Abstract

Deep learning has achieved state-of-the-art performance in many research applications, from computer vision to bioinformatics and from object detection to image generation. Within these recently developed deep-learning approaches, we can define the concept of multimodality. The objective of this research field is to implement methodologies that use several modalities as input features to perform predictions. There is a strong analogy here with human cognition, since we rely on several different senses to make decisions. In this article, we present a short survey on multimodal integration using deep-learning methods. We first comprehensively review the concept of multimodality, describing it from a two-dimensional perspective. The first dimension is a taxonomical description of the multimodality concept. The second dimension describes the fusion approaches used in multimodal deep learning. Finally, we describe four applications of multimodal deep learning in the following fields of research: speech recognition, sentiment analysis, forensic applications and image processing.

Publisher

MDPI AG

Subject

Computer Networks and Communications, Human-Computer Interaction

Link

https://www.mdpi.com/2073-431X/11/11/163/pdf

References: 75 articles.

1. Deep learning;Nature,2015

2. Color image segmentation: Advances and prospects;Pattern Recognit.,2001

3. Dimitri, G.M., Spasov, S., Duggento, A., Passamonti, L., and Toschi, N. (2020, January 20–24). Unsupervised stratification in neuroimaging through deep latent embeddings. Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada.

4. A survey on deep learning in medical image analysis;Med. Image Anal.,2017

5. Interactive alkaptonuria database: Investigating clinical data to improve patient care in a rare disease;FASEB J.,2019

Cited by 6 articles.

1. Multi-modal lifelog data fusion for improved human activity recognition: A hybrid approach;Information Fusion;2024-10

2. ENFformer: Long-short term representation of electric network frequency for digital audio tampering detection;Knowledge-Based Systems;2024-08

3. A Review of Key Technologies for Emotion Analysis Using Multimodal Information;Cognitive Computation;2024-06-01

4. A twin convolutional neural network with hybrid binary optimizer for multimodal breast cancer digital image classification;Scientific Reports;2024-01-06

5. High-availability displacement sensing with multi-channel self mixing interferometry;Optics Express;2023-06-14