Affiliation:
1. School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
Abstract
Bacterial and archaeal flagellins are remarkable in having both a shared region with variation comparable to that of housekeeping proteins and a region with extreme diversity, perhaps greater than for any other protein. Analysis of the 113,285 available full-gene flagellin sequences from published bacterial and archaeal sequences revealed the nature and enormous extent of flagellin diversity. There were 35,898 unique amino acid sequences, which were resolved into 187 clusters. Analysis of the Escherichia coli and Salmonella enterica flagellins revealed that the variation occurs at two levels. The first is the division of the variable regions into sequence forms so divergent that there is no meaningful alignment even within species; these corresponded to the E. coli or S. enterica H-antigen groups. The second level is variation within these groups, which is extensive in both species. The shared sequence would allow PCR amplification of the variable regions and thus strain-level analysis of microbiome DNA.
Publisher
American Society for Microbiology
Subject
Computer Science Applications; Genetics; Molecular Biology; Modelling and Simulation; Ecology, Evolution, Behavior and Systematics; Biochemistry; Physiology; Microbiology