Abstract
Synthetic electronic health records (EHRs) that are both realistic and privacy-preserving offer an alternative to real EHRs for machine learning (ML) and statistical analysis. However, generating high-fidelity EHR data in its original, high-dimensional form poses challenges for existing methods. We propose the Hierarchical Autoregressive Language mOdel (HALO) for generating longitudinal, high-dimensional EHRs that preserve the statistical properties of real EHRs and can be used to train accurate ML models without privacy concerns. HALO generates a probability density function over medical codes, clinical visits, and patient records, allowing it to generate realistic EHR data without requiring variable selection or aggregation. Extensive experiments demonstrated that HALO can generate high-fidelity data with high-dimensional disease code probabilities closely mirroring those of real EHR data (above 0.9 R² correlation). HALO also enhances the accuracy of predictive modeling and enables downstream ML models to attain accuracy similar to that of models trained on genuine data.
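The fidelity claim above compares per-code probabilities in real and synthetic data. The following is a minimal sketch of how such a comparison might be computed, not the authors' evaluation code. It assumes each patient is a list of visits, each visit is a list of code strings, and that "R2" means the coefficient of determination of the synthetic prevalences against the real ones; the function names `code_prevalence` and `r_squared` and the toy records are hypothetical.

```python
from collections import Counter


def code_prevalence(patients, vocab):
    """Fraction of patients whose record contains each code at least once."""
    counts = Counter()
    for visits in patients:
        seen = set()
        for visit in visits:
            seen.update(visit)
        counts.update(seen)  # count each code once per patient
    n = len(patients)
    return [counts[code] / n for code in vocab]


def r_squared(real, synth):
    """Coefficient of determination of synth against real (assumed meaning of R2)."""
    mean_real = sum(real) / len(real)
    ss_tot = sum((r - mean_real) ** 2 for r in real)
    ss_res = sum((r - s) ** 2 for r, s in zip(real, synth))
    return 1.0 - ss_res / ss_tot


# Toy, made-up records: patient -> visits -> codes.
vocab = ["A", "B", "C"]
real_patients = [[["A", "B"], ["A"]], [["B"]], [["A", "C"]], [["B", "C"]]]
synth_patients = [[["A"]], [["B"], ["B", "C"]], [["A", "B"]], [["B"]]]

real_prev = code_prevalence(real_patients, vocab)
synth_prev = code_prevalence(synth_patients, vocab)
print(real_prev, synth_prev, r_squared(real_prev, synth_prev))
```

With a realistic code vocabulary, a value close to 1 would indicate that synthetic code prevalences track the real ones closely; on a three-code toy example like this the score is not meaningful beyond showing the computation.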
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry; Multidisciplinary
Cited by
5 articles.