ICDXML: enhancing ICD coding with probabilistic label trees and dynamic semantic representations

Published: 2024-08-07 Issue: 1 Volume: 14
ISSN: 2045-2322
Container-title: Scientific Reports
Language: en
Short-container-title: Sci Rep

Author:

Wang Zeqiang, Wang Yuqi, Zhang Haiyang, Wang Wei, Qi Jun, Chen Jianjun, Sastry Nishanth, Johnson Jon, De Suparna

Abstract

Accurately assigning standardized diagnosis and procedure codes to clinical text is crucial for healthcare applications, but it remains challenging because of the complexity of medical language. This paper proposes a novel model that incorporates extreme multi-label classification to enhance International Classification of Diseases (ICD) coding. The model uses deformable convolutional neural networks to fuse hidden-layer outputs of pre-trained language models with external medical knowledge embeddings, combined through a multimodal approach, to provide rich semantic encodings for each code. A probabilistic label tree is built from the hierarchical structure of ICD labels, which incorporates ontological relationships between ICD codes and enables structured output prediction. Experiments on medical code prediction using the MIMIC-III database demonstrate competitive performance, highlighting the benefits of this technique for robust clinical code assignment.
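
To make the probabilistic label tree idea concrete, the sketch below shows the general technique, not the paper's implementation. It assumes a two-level hierarchy in which each ICD-9-style code sits under its three-character category prefix, and it uses hard-coded, hypothetical probabilities where a real system would use learned node classifiers. The function names `build_label_tree` and `predict_codes` are illustrative only.

```python
from collections import defaultdict


def build_label_tree(codes):
    """Group each code under its category prefix (the part before the dot).

    Assumption: a two-level hierarchy, category -> full code.
    """
    tree = defaultdict(list)
    for code in codes:
        tree[code.split(".")[0]].append(code)
    return dict(tree)


def predict_codes(tree, p_category, p_code_given_category, threshold=0.5):
    """Return codes whose root-to-leaf path probability reaches the threshold.

    Path probability follows the chain rule:
        P(code) = P(category) * P(code | category)
    Because P(code) can never exceed P(category), a whole category is
    skipped as soon as its own probability falls below the threshold.
    """
    predicted = []
    for category, codes in tree.items():
        if p_category[category] < threshold:
            continue  # prune: no child can reach the threshold
        for code in codes:
            if p_category[category] * p_code_given_category[code] >= threshold:
                predicted.append(code)
    return predicted


# Hypothetical scores standing in for the outputs of learned node classifiers.
tree = build_label_tree(["250.00", "250.01", "401.9"])
p_category = {"250": 0.9, "401": 0.2}
p_code_given_category = {"250.00": 0.8, "250.01": 0.1, "401.9": 0.9}

predicted = predict_codes(tree, p_category, p_code_given_category)
print(predicted)  # ['250.00']
```

The pruning step is what makes tree-structured prediction attractive for very large label spaces: most branches are discarded at an internal node without scoring any of their leaves.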

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41598-024-69214-9.pdf

References (45 articles)

1. Xu, K. et al. Multimodal machine learning for automated ICD coding. In Machine Learning for Healthcare Conference, 197–215 (PMLR, 2019).

2. Luo, J., Xiao, C., Glass, L., Sun, J. & Ma, F. Fusion: Towards automated ICD coding via feature compression. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2096–2101 (2021).

3. Zhang, Z., Liu, J. & Razavian, N. BERT-XML: Large scale automated ICD coding using BERT pretraining. In Proceedings of the 3rd Clinical Natural Language Processing Workshop, 24–34 (2020).

4. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186 (Association for Computational Linguistics, Minneapolis, Minnesota, 2019). https://doi.org/10.18653/v1/N19-1423

5. Taori, R. et al. Alpaca: A strong, replicable instruction-following model. Stanford Center Res. Found. Models 3, 7 (2023).