Abstract
Introduction
The course of COVID-19 varies from asymptomatic to severe in patients. The basis for this range in symptoms is unknown. One possibility is that genetic variation is partly responsible for the highly variable response. We evaluated how well a genetic risk score based on chromosomal-scale length variation and machine learning classification algorithms could predict severity of response to SARS-CoV-2 infection.
Methods
We compared 981 patients from the UK Biobank dataset who had a severe reaction to SARS-CoV-2 infection before 27 April 2020 to a similar number of age-matched patients drawn from the general UK Biobank population. For each patient, we built a profile of 88 numbers characterizing the chromosomal-scale length variability of their germ line DNA. Each number represented one quarter of one of the 22 autosomes (22 × 4 = 88). We used the machine learning algorithm XGBoost to build a classifier that could predict whether a person would have a severe reaction to COVID-19 based only on their 88-number profile.
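As a rough illustration of how an 88-number profile could be laid out, the sketch below indexes one value per quarter of each autosome. The label format (`chr1_q1`), the chromosome-major ordering, and the helper names are assumptions made for illustration; the abstract does not specify how the values were computed or ordered.

```python
# Hypothetical layout of the 88-number profile: 22 autosomes x 4 quarters.
# The naming scheme and column ordering are assumptions, not from the paper.
AUTOSOMES = range(1, 23)  # chromosomes 1..22
QUARTERS = range(1, 5)    # quarters 1..4 of each chromosome


def feature_names():
    """Return one label per (autosome, quarter) pair, e.g. 'chr1_q1'."""
    return [f"chr{c}_q{q}" for c in AUTOSOMES for q in QUARTERS]


def feature_index(chromosome, quarter):
    """Map a 1-based (autosome, quarter) pair to a 0-based column index."""
    if chromosome not in AUTOSOMES or quarter not in QUARTERS:
        raise ValueError("expected autosome 1-22 and quarter 1-4")
    return (chromosome - 1) * 4 + (quarter - 1)


names = feature_names()
```

With this layout, each patient would be one row of 88 values, and a classifier such as XGBoost would take those rows as its input features.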
Results
We found that the XGBoost classifier could differentiate between the two classes at a significant level, both as measured against a randomized control (p = 2 × 10⁻¹¹) and as measured against the expected value of a random guessing algorithm, AUC = 0.5 (p = 3 × 10⁻¹⁴). However, we found that the AUC of the classifier was only 0.51, too low for a clinically useful test.
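To make the AUC figures concrete, here is a minimal sketch of how an AUC can be computed from labels and classifier scores, and how a label-shuffled control of the kind mentioned above might be generated. The abstract does not say how its p-values were obtained, so the function names, the shuffling scheme, and the number of rounds are all assumptions.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shuffled_aucs(labels, scores, n_rounds=200, seed=0):
    """AUCs obtained after randomly permuting the labels (hypothetical
    control): these should scatter around 0.5."""
    rng = random.Random(seed)
    shuffled = list(labels)
    results = []
    for _ in range(n_rounds):
        rng.shuffle(shuffled)
        results.append(auc(shuffled, scores))
    return results
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so a value of 0.51 can be statistically distinguishable from chance in a large sample while still separating the two groups very weakly.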
Conclusion
Genetics play a role in the severity of COVID-19, but we cannot yet develop a useful genetic test to predict severity.
Publisher
Springer Science and Business Media LLC
Subject
Drug Discovery, Genetics, Molecular Biology, Molecular Medicine