Abstract
Deep neural networks have achieved state-of-the-art accuracy at classifying molecules with respect to whether they bind to specific protein targets. A key breakthrough would occur if these models could reveal the fragment pharmacophores that are causally involved in binding. Extracting chemical details of binding from the networks could enable scientific discoveries about the mechanisms of drug actions. However, doing so requires shining light into the black box that is the trained neural network model, a task that has proved difficult across many domains. Here we show how the binding mechanism learned by deep neural network models can be interrogated, using a recently described attribution method. We first work with carefully constructed synthetic datasets, in which the molecular features responsible for “binding” are fully known. We find that networks that achieve perfect accuracy on held-out test datasets still learn spurious correlations, and we are able to exploit this nonrobustness to construct adversarial examples that fool the model. This makes these models unreliable for accurately revealing information about the mechanisms of protein–ligand binding. In light of our findings, we prescribe a test that checks whether a hypothesized mechanism can be learned. If the test fails, it indicates that the model must be simplified or regularized and/or that the training dataset requires augmentation.
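The kind of test described above can be illustrated on a toy problem. The sketch below is not the authors' code. It assumes 8-bit binary "fragment" vectors in place of molecules, a logistic regression in place of a deep network, and integrated gradients with an all-zeros baseline as one possible attribution method (the abstract does not name the method used). The "binding" rule, the spurious feature, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_dataset(n, spurious, seed):
    """Toy 'molecules': 8 binary fragment flags.
    Hypothetical ground-truth rule: binds iff fragments 0 AND 1 are present.
    If spurious=True, fragment 7 is made to co-occur exactly with binding."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n, 8)).astype(float)
    y = X[:, 0] * X[:, 1]
    if spurious:
        X[:, 7] = y  # a dataset artifact, not part of the true mechanism
    return X, y

def train_logistic(X, y, steps=5000, lr=0.5):
    """Full-batch gradient descent on the logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        g = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def integrated_gradients(w, b, x, steps=200):
    """Integrated gradients from an all-zeros baseline (midpoint rule)."""
    alphas = (np.arange(steps) + 0.5) / steps
    z = (alphas[:, None] * x[None, :]) @ w + b
    p = sigmoid(z)
    grads = (p * (1.0 - p))[:, None] * w[None, :]  # d f / d x along the path
    return x * grads.mean(axis=0)

def attribution_ranking(spurious):
    """Train on a toy dataset and rank features by attribution for one binder."""
    X, y = make_dataset(2000, spurious, seed=0)
    w, b = train_logistic(X, y)
    binder = np.ones(8)  # contains fragments 0 and 1, so it binds
    attr = integrated_gradients(w, b, binder)
    return np.argsort(attr)[::-1]  # feature indices, most attributed first

clean_rank = attribution_ranking(spurious=False)
spurious_rank = attribution_ranking(spurious=True)
print("clean dataset, top-2 attributed features:", clean_rank[:2])
print("spurious dataset, top attributed feature:", spurious_rank[0])
```

In this toy setting, the check passes when the top-attributed features coincide with the known binding fragments, and fails when an artifact of the dataset receives the most attribution. A failure of this kind would point toward the remedies named in the abstract: a simpler or more regularized model, or an augmented training set.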
Funder
Simons Foundation
EC | FP7 | FP7 Ideas: European Research Council
NSF | Directorate for Mathematical and Physical Sciences
Publisher
Proceedings of the National Academy of Sciences
Cited by
52 articles.