Abstract
Computational approaches to small-molecule drug discovery now routinely scale to libraries containing billions of candidate molecules. One promising path to faster evaluation of billion-molecule libraries is to develop representations of each molecule that enable fast computation of similarity between molecules. Molecular fingerprints have long provided a mechanism for succinct representation and fast comparison of small molecules, and a large collection of competing fingerprints now exists. Here, we explore the utility of many of these fingerprints for predicting similar molecular activity. We show that fingerprint similarity provides insufficient discriminative power to distinguish active from inactive molecules for a target protein, given a known active. We also demonstrate that, even when the analysis is limited to active molecules, fingerprint similarity values do not correlate with compound potency. In sum, these results highlight the need for a new wave of molecular representations that will improve the capacity to detect biologically active molecules based on similarity to other such molecules.
Publisher
Cold Spring Harbor Laboratory