Affiliation:
1. Massachusetts Institute of Technology
2. University of Michigan
3. Universidad de Chile
4. Universidad del Cauca
Abstract
In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare. This paper introduces a new process model for multimodal Data Fusion for Data Mining, integrating embeddings and the Cross-Industry Standard Process for Data Mining with the existing Data Fusion Information Group model. Our model aims to decrease computational costs, complexity, and bias while improving efficiency and reliability. We also propose "disentangled dense fusion," a novel embedding fusion method designed to optimize mutual information and facilitate dense inter-modality feature interaction, thereby minimizing redundant information. We demonstrate the model's efficacy through three use cases: predicting diabetic retinopathy from retinal images and patient metadata; predicting domestic violence from satellite imagery, internet data, and census data; and identifying clinical and demographic features from radiography images and clinical notes. The model achieved a macro F1 score of 0.92 in diabetic retinopathy prediction; an R-squared of 0.854 and an sMAPE of 24.868 in domestic violence prediction; and, in radiological analysis, macro AUCs of 0.92 for disease prediction and 0.99 for sex classification.
These results underscore the Data Fusion for Data Mining model's potential to significantly impact multimodal data processing, promoting its adoption in diverse, resource-constrained settings.
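For readers less familiar with the reported metrics, the following is a minimal sketch of how macro F1, R-squared, and sMAPE are commonly computed. It is not the authors' evaluation code. The function names are hypothetical, and the sMAPE variant shown (percent scale, mean of absolute values in the denominator) is an assumption, since the abstract does not state which definition was used. Macro AUC is omitted for brevity.

```python
from typing import Hashable, Sequence


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE in percent (range 0 to 200).

    Assumed variant: |t - p| / ((|t| + |p|) / 2), averaged over samples.
    Pairs where both values are zero contribute 0 to the sum.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        total += abs(t - p) / denom if denom else 0.0
    return 100.0 * total / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = set(y_true) | set(y_pred)
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the paper.
    print(smape([100.0, 200.0], [110.0, 180.0]))
    print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
    print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

Because macro F1 averages per-class scores without weighting, a rare class counts as much as a common one, which is presumably why it is reported for the imbalanced clinical tasks.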
Publisher
Research Square Platform LLC
Cited by
3 articles.