Similarity metric learning on perturbational datasets improves functional identification of perturbations

Published: 2023-06-11

Author:

Smith Ian, Smirnov Petr, Haibe-Kains Benjamin

Abstract

Analysis of high-throughput perturbational datasets, including the Next Generation Connectivity Map (L1000) and the Cell Painting projects, uses similarity metrics to identify perturbations or disease states that induce similar changes in the biological feature space. Similarities among perturbations are then used to identify drug mechanisms of action, to nominate therapeutics for a particular disease, and to construct biological networks among perturbations and genes. Standard similarity metrics include correlations, cosine distance and gene set enrichment methods, but these methods operate on the measured features without refinement by transforming the measurement space. We introduce Perturbational Metric Learning (PeML), a weakly supervised similarity metric learning method to learn a data-driven similarity function that maximizes discrimination of replicate signatures by transforming the biological measurements into an intrinsic, dataset-specific basis. The learned similarity functions show substantial improvement for recovering known biological relationships, like mechanism of action identification. In addition to capturing a more meaningful notion of similarity, data in the transformed basis can be used for other analysis tasks, such as classification and clustering. Similarity metric learning is a powerful tool for the analysis of large biological datasets.
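As a rough illustration of the idea in the abstract, the sketch below compares cosine similarity computed on raw features with cosine similarity computed after a linear transform of the measurement space. It is not the PeML implementation: the matrix `W`, the toy replicate vectors, and the function names are all hypothetical, and `W` is fixed by hand here, whereas the abstract describes learning the transform from replicate signatures.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D signatures."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def transformed_cosine(a, b, W):
    """Cosine similarity after mapping both signatures through a linear map W."""
    W = np.asarray(W, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return cosine_similarity(W @ a, W @ b)


# Toy data (assumption): two replicates of one perturbation.
# Feature 0 carries the shared signal; feature 1 is noise with opposite sign.
rep1 = np.array([1.0, 2.0])
rep2 = np.array([1.0, -2.0])

# Hypothetical transform, chosen by hand: down-weight the noisy feature.
# A metric learning method would fit W to the data instead.
W = np.diag([1.0, 0.1])

raw = cosine_similarity(rep1, rep2)
learned = transformed_cosine(rep1, rep2, W)

print(f"raw cosine:         {raw:.3f}")
print(f"transformed cosine: {learned:.3f}")
```

In this toy case the replicates look dissimilar in the raw feature space and similar after the transform, which is the kind of replicate discrimination the abstract says the learned similarity function is optimized for.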

Publisher

Cold Spring Harbor Laboratory

References: 38 articles.

1. Highly multiplexed imaging of single cells using a high-throughput cyclic immunofluorescence method;Nature Communications;2015

2. A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles

Cited by 4 articles.

1. BioEncoder: A metric learning toolkit for comparative organismal biology;Ecology Letters;2024-08

2. Morphological profiling for drug discovery in the era of deep learning;Briefings in Bioinformatics;2024-05-23

3. Spatial domains identification in spatial transcriptomics by domain knowledge-aware and subspace-enhanced graph contrastive learning;2024-05-10

4. BioEncoder: a metric learning toolkit for comparative organismal biology;2024-04-05