Abstract
AbstractThe comparison of protein conformational ensembles is of central importance in structural biology. However, there are few computational methods for ensemble comparison, and those that are readily available, such as ENCORE, utilize methods that are sufficiently computationally expensive to be prohibitive for large ensembles. Here, a new method is presented for efficient representation and comparison of protein conformational ensembles. The method is based on the representation of a protein ensemble as a vector of probability distribution functions (pdfs), with each pdf representing the distribution of a local structural property such as the number of contacts between Cβatoms. Dissimilarity between two conformational ensembles is quantified by the Jensen Shannon distance between the corresponding set of probability distribution functions. The method is validated for conformational ensembles generated by molecular dynamics simulations of ubiquitin, as well as experimentally derived conformational ensembles of a 130 amino acid truncated form of human tau protein. In the ubiquitin ensemble dataset, the method was up to 88 times faster than the existing ENCORE software, while simultaneously utilizing 48 times fewer computing cores. We make the method available as a Python package, called PROTHON, and provide a GitHub page with the Python source code athttps://github.com/PlotkinLab/Prothon.
Publisher
Cold Spring Harbor Laboratory