Making the Most of Its Short Reads: A Bioinformatics Workflow for Analysing the Short-Read-Only Data of Leishmania orientalis (Formerly Named Leishmania siamensis) Isolate PCM2 in Thailand
Author:
Anuntasomboon Pornchai,
Siripattanapipong Suradej,
Unajak Sasimanas,
Choowongkomon Kiattawee,
Burchmore Richard,
Leelayoova Saovanee,
Mungthin Mathirut,
E-kobon Teerasak
Abstract
Background: Leishmania orientalis (formerly named Leishmania siamensis) has been neglected for years in Thailand. Genomic study of L. orientalis has gained much attention since the release of the first high-quality reference genome, that of isolate LSCM4. Integrating multiple sequencing platforms for whole-genome sequencing has proven effective, but at considerable cost. This study presents a preliminary bioinformatic workflow that couples multi-step de novo assembly with reference-based assembly to produce high-quality genomic drafts from the short-read Illumina sequence data of L. orientalis isolate PCM2.
Results: Integrating multi-step de novo assembly by MEGAHIT and SPAdes with the reference-based method using the L. enriettii genome, and salvaging the unmapped reads, produced a 30.27 Mb genomic draft of L. orientalis isolate PCM2 with 3367 contigs and 8887 predicted genes. Of the approaches tested, the integrated approach showed the best integrity, coverage, and contig alignment when compared with the genome of L. orientalis isolate LSCM4, collected from a northern province of Thailand. Similar patterns of gene ratios and frequency were observed in the GO biological process annotation. Fifty GO terms were assigned to the assembled genomes, and 23 of these (accounting for 61.6% of the annotated genes) showed higher gene counts and ratios in the results from our workflow than in those of the LSCM4 isolate.
Conclusions: These results indicate that the proposed bioinformatic workflow produced a genome of L. orientalis isolate PCM2 of acceptable quality for functional genomic analysis, maximising the use of the short-read data. The workflow could provide much of the information needed to identify strain-specific markers and virulence-associated genes useful for drug and vaccine development, ahead of a more exhaustive and expensive investigation.
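The Results above summarise the draft assembly by its total size and contig count. As a minimal sketch of how such summary figures are commonly derived from a list of contig lengths, the Python below computes contig count, total length, and N50. The function name `assembly_summary` and the toy lengths are hypothetical, chosen for illustration only. N50 is not reported in the abstract, and this is not necessarily the tool or metric set the authors used.

```python
from typing import Dict, List


def assembly_summary(contig_lengths: List[int]) -> Dict[str, int]:
    """Return contig count, total length (bp) and N50 for a set of contigs.

    Hypothetical helper for illustration; not taken from the study.
    """
    if not contig_lengths:
        return {"contigs": 0, "total_bp": 0, "n50": 0}

    # Sort longest first so the cumulative sum walks down the length ranking.
    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)

    running = 0
    n50 = 0
    for length in ordered:
        running += length
        # N50 is the length of the contig at which the cumulative sum
        # first reaches at least half of the total assembly length.
        if running * 2 >= total:
            n50 = length
            break

    return {"contigs": len(ordered), "total_bp": total, "n50": n50}


if __name__ == "__main__":
    # Toy lengths only; these are not data from the PCM2 assembly.
    toy_lengths = [500, 400, 300, 200, 100]
    print(assembly_summary(toy_lengths))
```

In practice the lengths would be read from the assembler's contig FASTA file, and the same summary could be computed for each assembly strategy to compare them side by side.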
Funder
Kasetsart University Research and Development Institute
Subject
General Agricultural and Biological Sciences; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology