Author:
Pfennig Aaron, Lomsadze Alexandre, Borodovsky Mark
Abstract
Some of the highly divergent crAssphages recently discovered in the human gut microbiome were reported to use multiple genetic codes. Opal or amber stop codon reassignments were present in parts of the genomes, while the standard genetic code was used in the remaining genome sections. In effect, the phage genomes were divided into distinct blocks, each using one code or the other. We have developed a tool, Mgcod, that identifies blocks with specific genetic codes and annotates protein-coding regions. We used Mgcod to scan a large set of human metagenomic contigs. As a result, we identified hundreds of contigs of viral origin in which the standard genetic code was used in some parts, while genetic codes with opal or amber stop codon reassignments were used in others. Many of these contigs originated from known crAssphages. Further investigation revealed that while the genes in one genomic block could be translated only by a distinct genetic code, translation of the genes in an adjacent block by either of the two genetic codes would produce proteins that differ little from each other. The dual-coded genes were enriched with early-stage phage genes, while a single code was used for the late-stage genes. The code-block structure expands the phage’s ability to infect bacteria whose genomes employ the standard genetic code. The new tool provides a means for accurate annotation of the unusual genomes of these phages.
Publisher
Cold Spring Harbor Laboratory