microGWAS: a computational pipeline to perform large scale bacterial genome-wide association studies

Published: 2024-07-10

Author:

Burgaya Judit, Damaris Bamu F., Fiebig Jenny, Galardini Marco

Abstract

Identifying genetic variants associated with bacterial phenotypes, such as virulence, host preference, and antimicrobial resistance, has great potential for improving our understanding of the mechanisms underlying these traits. The availability of large collections of bacterial genomes has made genome-wide association studies (GWAS) a common approach for this purpose. However, the need to employ multiple software tools for data pre- and post-processing limits the application of these methods to experienced bioinformaticians. To address this issue, we have developed a pipeline to perform bacterial GWAS from a set of assemblies and annotations, with multiple phenotypes as targets. The associations are run using five sets of genetic variants: unitigs, gene presence/absence, rare variants (i.e. gene burden test), gene cluster specific k-mers, and all unitigs jointly. All variants passing the association threshold are further annotated to identify overrepresented biological processes and pathways. The results can be further augmented by generating a phylogenetic tree and by predicting the presence of antimicrobial resistance and virulence associated genes. We tested the microGWAS pipeline on a previously reported dataset on E. coli virulence, successfully identifying the causal variants and providing further interpretation of the association results. The microGWAS pipeline integrates state-of-the-art tools to perform bacterial GWAS into a single, user-friendly, and reproducible pipeline, allowing for the democratization of these analyses. The pipeline can be accessed, together with its documentation, at: https://github.com/microbial-pangenomes-lab/microGWAS.

Publisher

Cold Spring Harbor Laboratory
