Abstract
In macromolecular structure determination using X-ray diffraction from multiple crystals, the presence of different structures (structural polymorphs) necessitates the classification of diffraction data for appropriate structural analysis. Hierarchical clustering analysis (HCA) is a promising technique that has so far been used to extract isomorphous data, mainly for single structure determination. Although in principle the use of HCA can be extended to detect polymorphs, the absence of a reference for defining a ‘similarity’ threshold used for grouping the isomorphous datasets poses a challenge. Here, we have applied unit cell-based and intensity-based HCAs to the datasets of apo-trypsin and inhibitor-bound trypsin that were mixed post-data acquisition to investigate how effective HCA is in classifying polymorphous datasets. Single-step intensity-based HCA successfully classified polymorphs with a certain ‘similarity’ threshold. In datasets of several samples containing an unknown degree of structural heterogeneity, polymorphs could be identified by intensity-based HCA using the suggested ‘similarity’ threshold. Polymorphs were also detected in single crystals using the data collected by the continuous helical scan method. These findings are expected to facilitate the determination of multiple structural snapshots by exploiting automated data collection and analysis.
Synopsis
Single-step intensity-based hierarchical clustering is demonstrated to allow the detection of structural polymorphs in the diffraction datasets obtained from multiple crystals. By splitting the datasets collected by continuous helical scan into several chunks, both inter- and intra-crystal polymorphs can be successfully analyzed.
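To illustrate the general idea of intensity-based hierarchical clustering with a ‘similarity’ threshold, the following is a minimal sketch on synthetic data and not the authors' pipeline. Everything in it is an assumption made for illustration: the distance sqrt(1 - CC^2) between intensity sets, the average-linkage rule, the threshold value of 0.2, and the helper names `simulate`, `cc_distance` and `cluster`. The abstract does not specify any of these.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient (CC) between two intensity lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cc_distance(x, y):
    """Assumed distance between two datasets: sqrt(1 - CC^2)."""
    cc = pearson(x, y)
    return math.sqrt(max(0.0, 1.0 - cc * cc))


def cluster(datasets, threshold):
    """Average-linkage agglomerative clustering, stopped at `threshold`.

    Returns a list of clusters, each a sorted list of dataset indices.
    """
    n = len(datasets)
    dist = [[cc_distance(datasets[i], datasets[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, p, q = min(
            (linkage(a, b), p, q)
            for p, a in enumerate(clusters)
            for q, b in enumerate(clusters)
            if p < q
        )
        if d > threshold:  # remaining clusters are treated as distinct forms
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


def simulate(n_refl=500, n_per_form=5, form_shift=0.3, noise=0.05, seed=0):
    """Synthetic intensities for two hypothetical forms, A and B.

    Form B is form A with each reflection perturbed by about 30 percent,
    and every dataset adds about 5 percent measurement noise.
    """
    rng = random.Random(seed)
    form_a = [rng.expovariate(1.0) for _ in range(n_refl)]
    form_b = [i * (1.0 + form_shift * rng.gauss(0, 1)) for i in form_a]
    data = []
    for base in (form_a, form_b):
        for _ in range(n_per_form):
            data.append([i * (1.0 + noise * rng.gauss(0, 1)) for i in base])
    return data


if __name__ == "__main__":
    groups = cluster(simulate(), threshold=0.2)  # threshold is hypothetical
    for k, members in enumerate(groups):
        print(f"cluster {k}: datasets {members}")
```

In this toy setting the threshold decides whether two groups of datasets are kept apart as separate forms or merged as isomorphous, which is why a reference value for it matters when the degree of heterogeneity is unknown.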
Publisher
Cold Spring Harbor Laboratory