Abstract
Since December 2019, the world has been intensely affected by the COVID-19 pandemic, caused by the SARS-CoV-2 virus, first identified in Wuhan, China. When a novel virus is identified, early elucidation of the taxonomic classification and origin of its genomic sequence is essential for strategic planning, containment, and treatment. Deep learning techniques have been used successfully in many viral classification problems associated with viral infection diagnosis, metagenomics, and phylogenetic analysis. This work proposes an efficient viral genome classifier for the SARS-CoV-2 virus using a deep neural network (DNN) based on the stacked sparse autoencoder (SSAE) technique. We performed four different experiments to provide different levels of taxonomic classification of the SARS-CoV-2 virus. Confusion matrices are presented for the validation and test sets, and ROC curves for the validation set. In all experiments, the SSAE technique achieved high classification performance. We explored image representations of complete genome sequences as the SSAE input to classify SARS-CoV-2, using a dataset based on a k-mer image representation with k = 6. The results indicate that this deep learning technique is applicable to genome classification problems.
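The k-mer image representation mentioned above can be illustrated with a minimal sketch. The abstract does not specify how the k-mer counts are laid out, so the row-major arrangement, the max-normalization, and the function name `kmer_image` below are all assumptions made for illustration, not the authors' method. The only fact relied on is that with k = 6 there are 4^6 = 4096 possible k-mers, which fit a 64 × 64 grid.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_image(sequence, k=6):
    """Count k-mers and arrange the 4**k counts as a square grid.

    Hypothetical helper: row-major layout and max-normalization are
    illustrative assumptions, not taken from the paper.
    """
    # Map every possible k-mer to a fixed position (lexicographic order).
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}
    counts = [0] * (4 ** k)

    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # None if the window has an ambiguous base
        if j is not None:
            counts[j] += 1

    side = 2 ** k            # 4**k == (2**k)**2, so the grid is square
    peak = max(counts) or 1  # avoid division by zero for short sequences
    # Scale counts to [0, 1] so the grid can be treated as a grayscale image.
    return [[counts[r * side + c] / peak for c in range(side)]
            for r in range(side)]


if __name__ == "__main__":
    image = kmer_image("ACGT" * 50, k=6)
    print(len(image), len(image[0]))
```

Flattening such a grid row by row would give a 4096-element vector that could serve as input to an autoencoder; windows containing ambiguous bases such as N are simply skipped in this sketch.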
Publisher
Cold Spring Harbor Laboratory
Cited by
3 articles.