Abstract
The cyanobacteriochrome GAF domains represent a trove of spectral diversity. These proteins are endemic to cyanobacteria and sense the color and power of light. Multiple mechanisms are used to tune the natural absorbance spectrum of the bound bilin chromophore. In practice, these mechanisms are difficult to identify from the predicted amino acid sequence, and the presence of any one of them rarely yields a consistent and predictable outcome. The absorbance characteristics of the GAF domain are a complex function of many such tuning mechanisms. This implies that a more combinatoric approach to characterizing the diversity of GAF domains would better predict spectral tunes. We reviewed the literature and constructed a dataset of predicted and confirmed cyanobacteriochrome GAF domains. This dataset was subjected to multiple sequence alignments, and 18 GAF domain families were defined. The amino acid sequence similarity correlated well with known spectral characteristics, but there were exceptions. A second approach to predicting chromotype used Principal Component Analysis to characterize the whole domain architectures of cyanobacteriochromes. This approach identified 7 conserved domain architectures, with some variations. Like the 18 GAF families, these architectures also correlated with the spectral tune of the GAF domains they contain. The three-dimensional structures of 98 spectrally characterized GAF domains were predicted using Phyre2. Subsequent grouping based on distance maps offered an insight into how the general spectral position of the domain is set. Finer tuning is likely to be achieved by means of six key residues within the binding pocket. Taken together, these insights allowed us to carry out a Multiple Correlation Analysis serving as a mathematical summary of the diversity of cyanobacteriochrome GAF domains. This summary or “cyanobacteriochrome atlas” can be used to make spectral predictions on uncharacterized GAF domains.
Publisher
Cold Spring Harbor Laboratory