Authors:
Azhir Alaleh, Hügel Jonas, Tian Jiazi, Cheng Jingya, Bassett Ingrid V., Bell Douglas S., Bernstam Elmer V., Farhat Maha R., Henderson Darren W., Lau Emily S., Morris Michele, Semenov Yevgeniy R., Triant Virginia A., Visweswaran Shyam, Strasser Zachary H., Klann Jeffrey G., Murphy Shawn N., Estiri Hossein
Abstract
Scalable identification of patients with the post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms and the suboptimal accuracy, demographic biases, and underestimation of the PASC diagnosis code (ICD-10 U09.9). In a retrospective case-control study, we developed a precision phenotyping algorithm for identifying research cohorts of PASC patients, defined as a diagnosis of exclusion. We used longitudinal electronic health records (EHR) data from over 295 thousand patients from 14 hospitals and 20 community health centers in Massachusetts. The algorithm employs an attention mechanism to exclude sequelae that prior conditions can explain. We performed independent chart reviews to tune and validate our precision phenotyping algorithm. Our PASC phenotyping algorithm improves precision and prevalence estimation and reduces bias in identifying Long COVID patients compared to the U09.9 diagnosis code. Our algorithm identified a PASC research cohort of over 24 thousand patients (compared to about 6 thousand when using the U09.9 diagnosis code), with a 79.9 percent precision (compared to 77.8 percent from the U09.9 diagnosis code). Our estimated prevalence of PASC was 22.8 percent, which is close to the national estimates for the region. We also provide an in-depth analysis outlining the clinical attributes, encompassing identified lingering effects by organ, comorbidity profiles, and temporal differences in the risk of PASC. The PASC phenotyping method presented in this study boasts superior precision, accurately gauges the prevalence of PASC without underestimating it, and exhibits less bias in pinpointing Long COVID patients. The PASC cohort derived from our algorithm will serve as a springboard for delving into Long COVID’s genetic, metabolomic, and clinical intricacies, surmounting the constraints of recent PASC cohort studies, which were hampered by their limited size and available outcome data.
Publisher
Cold Spring Harbor Laboratory