DF-DM: A foundational process model for multimodal data fusion in the artificial intelligence era

Author:

Restrepo David1, Wu Chenwei2, Vásquez-Venegas Constanza3, Nakayama Luis Filipe1, Celi Leo Anthony1, López Diego M4

Affiliation:

1. Massachusetts Institute of Technology

2. University of Michigan

3. Universidad de Chile

4. Universidad del Cauca

Abstract

In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare. This paper introduces a new process model for multimodal Data Fusion for Data Mining (DF-DM), integrating embeddings and the Cross-Industry Standard Process for Data Mining with the existing Data Fusion Information Group model. Our model aims to decrease computational costs, complexity, and bias while improving efficiency and reliability. We also propose "disentangled dense fusion," a novel embedding fusion method designed to optimize mutual information and facilitate dense inter-modality feature interaction, thereby minimizing redundant information. We demonstrate the model's efficacy through three use cases: predicting diabetic retinopathy from retinal images and patient metadata, predicting domestic violence from satellite imagery, internet data, and census data, and identifying clinical and demographic features from radiography images and clinical notes. The model achieved a macro F1 score of 0.92 in diabetic retinopathy prediction, an R-squared of 0.854 and an sMAPE of 24.868 in domestic violence prediction, and macro AUCs of 0.92 and 0.99 for disease prediction and sex classification, respectively, in radiological analysis. These results underscore the potential of the Data Fusion for Data Mining model to significantly impact multimodal data processing and support its adoption in diverse, resource-constrained settings.

Publisher

Research Square Platform LLC

Cited by 3 articles.

1. Enhancing Energy Efficiency in Green Buildings through Artificial Intelligence;Frontiers in Science and Engineering;2024-08-21

2. A multimodal framework for extraction and fusion of satellite images and public health data;Scientific Data;2024-06-15

3. Enhancing Medical Imaging with GANs Synthesizing Realistic Images from Limited Data;2024 IEEE 4th International Conference on Electronic Technology, Communication and Information (ICETCI);2024-05-24
