Abstract
Trypanosoma cruzi is the causative agent of Chagas disease, which causes 10,000 deaths per year. Despite the high mortality caused by the pathogen, relatively few parasite genomes have been assembled to date; even some commonly used laboratory strains do not have publicly available genome assemblies. This is at least partially due to T. cruzi’s highly complex and highly repetitive genome: while describing the variation in genome content and structure is critical to better understanding T. cruzi biology and the mechanisms that underlie Chagas disease, the complexity of the genome defies investigation using traditional short read sequencing methods. Here, we have generated a high-quality whole genome assembly of the hybrid Tulahuen strain, a commercially available Type VI strain, using long read Nanopore sequencing without short read scaffolding. Using automated tools and manual curation for annotation, we report a genome with 25% repeat regions, 17% variable multigene family members, and 27% transposable elements. Notably, we find that regions with transposable elements are significantly enriched for surface proteins, and that surface proteins are, on average, closer to transposable elements than other coding regions are. This finding supports a possible mechanism for diversification of surface proteins in which mobile genetic elements such as transposons facilitate recombination within the gene family. This work demonstrates the feasibility of using Nanopore sequencing to resolve complex regions of T. cruzi genomes and, with these resolved regions, provides support for a possible mechanism for genomic diversification.
Publisher
Cold Spring Harbor Laboratory