Abstract
AbstractFragment screening is used to identify binding sites and leads in drug discovery, but it is often unclear which binding sites are functionally important. Here, data from 37 experiments, and 1309 protein structures binding to 1601 ligands were analysed. A method to group ligands by binding sites is introduced and sites clustered according to profiles of relative solvent accessibility. This identified 293 unique ligand binding sites, grouped into four clusters (C1-4). C1 includes larger, buried, conserved, and population missense-depleted sites, enriched in known functional sites. C4 comprises smaller, accessible, divergent, missense-enriched sites, depleted in functional sites. A site in C1 is 28 times more likely to be functional than one in C4. Seventeen sites, which to the best of our knowledge are novel, in 13 proteins are identified as likely to be functionally important with examples from human tenascin and 5-aminolevulinate synthase highlighted. A multi-layer perceptron, and K-nearest neighbours model are presented to predict cluster labels for ligand binding sites with an accuracy of 96% and 100%, respectively, so allowing functional classification of sites for proteins not in this set. Our findings will be of interest to those studying protein-ligand interactions and developing new drugs or function modulators.
Funder
RCUK | Biotechnology and Biological Sciences Research Council
Wellcome Trust
Publisher
Springer Science and Business Media LLC
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献