Abstract
This study examines the query performance of the NBC++ (Incremental Naive Bayes Classifier) program across variations in canonicality, k-mer size, databases, and input sample data size. NBC++ can successfully assess a wide range of superkingdoms using a small training database. We demonstrate that both NBC++ and Kraken2 are affected by database depth, with macro measures increasing with depth, but that the full diversity of life, especially viruses, remains a challenge for these classifiers. NBC++ spends less time training, but at the cost of long querying time. The major enhancements are canonical k-mer storage (with major storage savings); adaptable, optimized memory allocation that speeds up query analysis and allows the classifier to run on almost any system; and output of the log-likelihood values against each training genome, which provides users with valuable confidence information.
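The storage saving from canonical k-mers can be illustrated with a minimal Python sketch. This is not NBC++'s implementation; the function names (`reverse_complement`, `canonical`, `count_kmers`) are hypothetical, and the sketch assumes that "canonical" means the lexicographically smaller of a k-mer and its reverse complement, and that the input contains only A, C, G, and T.

```python
from collections import Counter

# Translation table mapping each base to its complement (assumes only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def count_kmers(seq: str, k: int, use_canonical: bool = True) -> Counter:
    """Count k-mers in seq, optionally collapsing each onto its canonical form."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[canonical(kmer) if use_canonical else kmer] += 1
    return counts
```

Because a k-mer and its reverse complement share one entry, a canonical table holds roughly half as many keys as a strand-specific one. For example, `count_kmers("AACGTT", 3)` stores two distinct keys, while `count_kmers("AACGTT", 3, use_canonical=False)` stores four.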
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.