Data management for continuous learning in EHR systems

Author:

Bellandi Valerio (1), Ceravolo Paolo (2), Maggesi Jonatan (1), Maghool Samira (3)

Affiliation:

1. Department of Computer Science, Università degli Studi di Milano, Milano, Italy

2. Computer Science, Università degli Studi di Milano, Milano, Italy

3. Computer Science, University of Milan, Milano, Italy

Abstract

To gain a comprehensive understanding of a patient’s health, advanced analytics must be applied to the data collected by electronic health record (EHR) systems. However, managing and curating this data requires carefully designed workflows. While digitalization and standardization enable continuous health monitoring, missing values and technical issues can compromise the consistency and timeliness of the data. In this paper, we propose a workflow for developing prognostic models that leverages the SMART BEAR infrastructure and the capabilities of the Big Data Analytics (BDA) engine to homogenize and harmonize data points. Our workflow improves data quality by evaluating different imputation algorithms and selecting the one that keeps the distribution and correlation of features closest to those of the raw data. We apply this workflow to a subset of the data stored in the SMART BEAR repository and examine its impact on the prediction of emerging health states such as cardiovascular disease and mild depression. We also discuss the possibility of model validation by clinicians in the SMART BEAR project, the transmission of subsequent actions to the decision support system, and the estimation of the required number of data points.
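The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which imputation algorithms or similarity measures the authors used, so everything below is an assumption made for illustration: the two candidate imputers (mean and random hot-deck), the Kolmogorov-Smirnov statistic as the distribution measure, the Pearson coefficient as the correlation measure, and all function names are hypothetical and not taken from the SMART BEAR codebase.

```python
import bisect
import math
import random
from typing import Callable, Dict, List, Optional, Tuple

Column = List[Optional[float]]  # None marks a missing value


def mean_impute(col: Column) -> List[float]:
    """Fill gaps with the mean of the observed values (assumed candidate)."""
    observed = [v for v in col if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in col]


def hot_deck_impute(col: Column, seed: int = 0) -> List[float]:
    """Fill gaps with randomly drawn observed values (assumed candidate)."""
    rng = random.Random(seed)
    observed = [v for v in col if v is not None]
    return [rng.choice(observed) if v is None else v for v in col]


def ks_statistic(a: List[float], b: List[float]) -> float:
    """Largest gap between the empirical CDFs of two samples."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a_sorted) | set(b_sorted))

    def ecdf(sample: List[float], t: float) -> float:
        return bisect.bisect_right(sample, t) / len(sample)

    return max(abs(ecdf(a_sorted, t) - ecdf(b_sorted, t)) for t in points)


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def fidelity_gap(feature: Column, companion: List[float],
                 imputer: Callable[[Column], List[float]]) -> float:
    """Distribution shift plus correlation shift caused by an imputer.

    The raw reference uses only rows where `feature` was observed;
    `companion` is a fully observed second feature.
    """
    idx = [i for i, v in enumerate(feature) if v is not None]
    raw_x = [feature[i] for i in idx]
    raw_y = [companion[i] for i in idx]
    filled = imputer(feature)
    dist_gap = ks_statistic(raw_x, filled)
    corr_gap = abs(pearson(raw_x, raw_y) - pearson(filled, companion))
    return dist_gap + corr_gap


def select_imputer(feature: Column, companion: List[float],
                   candidates: Dict[str, Callable[[Column], List[float]]]
                   ) -> Tuple[str, Dict[str, float]]:
    """Return the candidate with the smallest fidelity gap, plus all gaps."""
    gaps = {name: fidelity_gap(feature, companion, fn)
            for name, fn in candidates.items()}
    return min(gaps, key=gaps.get), gaps


if __name__ == "__main__":
    # Toy values for illustration only, not SMART BEAR data.
    feature = [1.0, None, 3.0, 4.0, None, 6.0]
    companion = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]
    best, gaps = select_imputer(
        feature, companion,
        {"mean": mean_impute, "hot_deck": hot_deck_impute})
    print(best, gaps)
```

Summing the two gaps with equal weight is itself a design assumption; a real workflow would likely weight or report them separately and compare more than two candidates.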

Publisher

Association for Computing Machinery (ACM)

