Abstract
Although the three-dimensional structures of G-protein-coupled receptors (GPCRs), the largest superfamily of drug targets, have enabled structure-based drug design, no structures are available for 87% of GPCRs. This is due to the difficulty of purifying the inherently flexible GPCRs. Identifying thermostabilized mutant GPCRs via systematic alanine scanning mutations has been a successful strategy for stabilizing GPCRs, but it remains a daunting task for each GPCR. We developed a computational method that combines sequence-, structure-, and dynamics-based molecular properties of GPCRs that recapitulate GPCR stability with four different machine learning methods to predict thermostable mutations ahead of experiments. This method has been trained on thermostability data for 1231 mutants, the largest publicly available dataset. A blind prediction of thermostable mutations of the complement factor C5a receptor retrieved 36% of the thermostable mutants in the top 50 prioritized mutants, compared to 3% in the first 50 attempts using systematic alanine scanning.
Statement of Significance
G-protein-coupled receptors (GPCRs), the largest superfamily of membrane proteins, play a vital role in cellular physiology and are the targets of blockbuster drugs. Hence, it is imperative to solve the three-dimensional structures of GPCRs in various conformational states with different types of ligands bound. To reduce the experimental burden of identifying thermostable GPCR mutants, we report a computational framework that uses machine learning algorithms, trained on thermostability data for 1231 mutants and on features calculated from analysis of GPCR sequences, structure, and dynamics, to predict thermostable mutations ahead of experiments. This work represents a significant advance in the development, validation, and testing of a computational framework that can be extended to other class A GPCRs and helical membrane proteins.
Publisher
Cold Spring Harbor Laboratory