Using comparative genome analysis to identify problems in annotated microbial genomes

Published:2010-07-01 Issue:7 Volume:156 Page:1909-1917
ISSN:1350-0872
Container-title:Microbiology
language:en

Author:

Poptsova Maria S.¹, Gogarten J. Peter¹

Affiliation:

1. Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269-3125, USA

Abstract

Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes differentially impact different types of analyses. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to the newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is an urgent and unresolved one. Due to the progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.

Publisher

Microbiology Society

Subject

Microbiology

Cited by 84 articles.

1. Proteins à la carte: riboproteogenomic exploration of bacterial N-terminal proteoform expression;mBio;2024-04-10

2. The interkingdom horizontal gene transfer in 44 early diverging fungi boosted their metabolic, adaptive, and immune capabilities;Evolution Letters;2024-03-05

3. Molecular and Genetic Characterization of Colicinogenic Escherichia coli Strains Active against Shiga Toxin-Producing Escherichia coli O157:H7;Foods;2023-07-11

4. Uncovering Pseudogenes and Intergenic Protein-coding Sequences in TriTryps’ Genomes;Genome Biology and Evolution;2022-10-01

5. Helicobacter pylori virulence factors: relationship between genetic variability and phylogeographic origin;PeerJ;2021-11-26