Hybrid Value-Aware Transformer Architecture for Joint Learning from Longitudinal and Non-Longitudinal Clinical Data

Published: 2023-06-29 Issue: 7 Volume: 13 Page: 1070
ISSN: 2075-4426
Container-title: Journal of Personalized Medicine
Language: en
Short-container-title: JPM

Author:

Shao Yijun¹², Cheng Yan¹², Nelson Stuart J.¹, Kokkinos Peter¹²³, Zamrini Edward Y.¹²⁴⁵, Ahmed Ali¹²⁶, Zeng-Treitler Qing¹²

Affiliation:

1. Department of Clinical Research and Leadership, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA

2. Washington DC VA Medical Center, Washington, DC 20422, USA

3. Department of Kinesiology and Health, School of Arts and Sciences, Rutgers University, New Brunswick, NJ 08901, USA

4. Department of Neurology, School of Medicine, University of Utah, Salt Lake City, UT 84112, USA

5. Irvine Clinical Research, Irvine, CA 92614, USA

6. Department of Medicine, School of Medicine, Georgetown University, Washington, DC 20057, USA

Abstract

The Transformer is the latest deep neural network (DNN) architecture for sequence data learning and has revolutionized the field of natural language processing. This success has motivated researchers to explore its application in the healthcare domain. Despite the similarities between longitudinal clinical data and natural language data, clinical data present unique complexities that make adapting the Transformer to this domain challenging. To address this issue, we have designed a new Transformer-based DNN architecture, referred to as Hybrid Value-Aware Transformer (HVAT), which can jointly learn from longitudinal and non-longitudinal clinical data. HVAT is unique in its ability to learn from the numerical values associated with clinical codes/concepts such as labs, and in its use of a flexible longitudinal data representation called clinical tokens. We have also trained a prototype HVAT model on a case-control dataset, achieving high performance in predicting Alzheimer’s disease and related dementias as the patient outcome. The results demonstrate the potential of HVAT for broader clinical data-learning tasks.

Funder

U.S. National Institutes of Health/National Institute on Aging

Publisher

MDPI AG

Subject

Medicine (miscellaneous)

Link

https://www.mdpi.com/2075-4426/13/7/1070/pdf

References: 28 articles.

1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017, December 4–9). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA.

2. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019, June 2–7). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA.

3. Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training, OpenAI.

4. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., and Sutskever, I. (2019). Language Models Are Unsupervised Multitask Learners, OpenAI.

5. Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020). Language Models are Few-Shot Learners. arXiv.