Affiliation:
1. Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
Abstract
The assembly evaluation process is the starting step towards meaningful downstream data
analysis. We need to know how much accurate information an assembled sequence contains before
proceeding to any data analysis stage. Four basic metrics are targeted by different assembly
evaluation tools: contiguity, accuracy, completeness, and contamination. Some tools evaluate these
metrics by comparing the assembly results to a closely related reference. Others use various
heuristics to compensate for the absence of a guiding reference, such as the consistency between assembly
results and sequencing reads. In this paper, we discuss the assembly evaluation process as a
core stage in any sequence assembly pipeline and present a roadmap that is followed by most assembly
evaluation tools to assess different metrics. We highlight the current challenges facing
assembly evaluation tools and summarize their technical and practical details to help end-users
choose the best tool for their working scenarios. To illustrate the similarities and differences
among assembly assessment tools, including their evaluation approaches, metrics, comprehensiveness,
limitations, usability, and how the evaluation results are presented to the end-user, we
provide a practical example that evaluates Velvet assembly results for the S. aureus dataset from the GAGE
competition. A GitHub repository (https://github.com/SaraEl-Metwally/Assembly-Evaluation-Tools)
provides the evaluation result details along with the corresponding command-line parameters.
Publisher
Bentham Science Publishers Ltd.
Subject
Computational Mathematics, Genetics, Molecular Biology, Biochemistry
Cited by
1 article.