Bipartite graph-based approach for clustering of cell lines by gene expression–drug response associations

Published:2021-03-03 Issue:17 Volume:37 Page:2617-2626
ISSN:1367-4803
Container-title:Bioinformatics
language:en

Author:

Chi Calvin¹, Ye Yuting², Chen Bin³,⁴, Huang Haiyan¹,⁵

Affiliation:

1. Center of Computational Biology, College of Engineering, University of California, Berkeley, CA 94720, USA

2. Division of Biostatistics, University of California, Berkeley, CA 94720, USA

3. Department of Pediatrics and Human Development, Michigan State University, Grand Rapids, MI 48912, USA

4. Department of Pharmacology and Toxicology, Michigan State University, Grand Rapids, MI 48824, USA

5. Department of Statistics, University of California, Berkeley, CA 94720, USA

Abstract

Motivation: In pharmacogenomic studies, the biological context of cell lines influences the predictive ability of drug-response models and the discovery of biomarkers. Thus, similar cell lines are often studied together based on prior knowledge of biological annotations. However, this selection approach is not scalable with the number of annotations, and the relationship between gene–drug association patterns and biological context may not be obvious.

Results: We present a procedure to compare cell lines based on their gene–drug association patterns. Starting with a grouping of cell lines from biological annotation, we model gene–drug association patterns for each group as a bipartite graph between genes and drugs. This is accomplished by applying sparse canonical correlation analysis (SCCA) to extract the gene–drug associations, and using the canonical vectors to construct the edge weights. Then, we introduce a nuclear norm-based dissimilarity measure to compare the bipartite graphs. Accompanying our procedure is a permutation test to evaluate the significance of similarity of cell line groups in terms of gene–drug associations. In the pharmacogenomic datasets CTRP2, GDSC2 and CCLE, hierarchical clustering of carcinoma groups based on this dissimilarity measure uniquely reveals clustering patterns driven by carcinoma subtype rather than primary site. Next, we show that the top associated drugs or genes from SCCA can be used to characterize the clustering patterns of haematopoietic and lymphoid malignancies. Finally, we confirm by simulation that when drug responses are linearly dependent on expression, our approach is the only one that can effectively infer the true hierarchy compared to existing approaches.

Availability and implementation: Bipartite graph-based hierarchical clustering is implemented in R and can be obtained from CRAN: https://CRAN.R-project.org/package=hierBipartite. The source code is available at https://github.com/CalvinTChi/hierBipartite. The datasets were derived from sources in the public domain, which are the Cancer Cell Line Encyclopedia (https://portals.broadinstitute.org/ccle), the Cancer Therapeutics Response Portal (https://portals.broadinstitute.org/ctrp.v2.1/?page=#ctd2BodyHome), and the Genomics of Drug Sensitivity in Cancer (https://www.cancerrxgene.org/). These datasets can be downloaded using the PharmacoGx R package (https://bioconductor.org/packages/release/bioc/html/PharmacoGx.html).

Supplementary information: Supplementary data are available at Bioinformatics online.
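The pipeline described in the abstract (gene–drug edge weights built from a pair of sparse vectors, then a nuclear norm-based comparison of two bipartite graphs) can be illustrated with a rough Python sketch. This is not the authors' implementation and does not use the hierBipartite API. As a stand-in for SCCA it soft-thresholds the leading singular vectors of the gene–drug cross-covariance matrix. The function names, the `lam` threshold, the Frobenius normalization and the exact form of the dissimilarity are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, lam):
    """Shrink entries toward zero; entries with |v| <= lam become exactly zero."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def edge_weights(X, Y, lam=0.0):
    """Hypothetical helper: genes-by-drugs edge weight matrix for one group.

    X: samples x genes expression, Y: samples x drugs response.
    Assumption: sparse vectors come from soft-thresholding the leading
    singular vectors of the cross-covariance, a simplified stand-in for SCCA.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)          # genes x drugs cross-covariance
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    u = soft_threshold(U[:, 0], lam)           # gene-side vector
    v = soft_threshold(Vt[0], lam)             # drug-side vector
    B = np.abs(np.outer(u, v))                 # abs removes SVD sign ambiguity
    norm = np.linalg.norm(B)                   # Frobenius norm
    return B / norm if norm > 0 else B

def nuclear_norm_dissimilarity(B1, B2):
    """Assumed form: nuclear norm (sum of singular values) of the difference."""
    return np.linalg.norm(B1 - B2, ord="nuc")

# Toy data: two groups whose drug responses depend linearly on expression.
rng = np.random.default_rng(0)
n, genes, drugs = 40, 12, 5
X1, X2 = rng.normal(size=(n, genes)), rng.normal(size=(n, genes))
W1, W2 = rng.normal(size=(genes, drugs)), rng.normal(size=(genes, drugs))
Y1 = X1 @ W1 + 0.1 * rng.normal(size=(n, drugs))
Y2 = X2 @ W2 + 0.1 * rng.normal(size=(n, drugs))

B1, B2 = edge_weights(X1, Y1), edge_weights(X2, Y2)
d_same = nuclear_norm_dissimilarity(B1, B1)
d_diff = nuclear_norm_dissimilarity(B1, B2)
print(d_same, d_diff)
```

A matrix of such pairwise dissimilarities between cell line groups could then be passed to any standard hierarchical clustering routine. The permutation test mentioned in the abstract is not sketched here.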

Funder

National Science Foundation Graduate Research Fellowship Program

National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

http://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btab143/38924164/btab143.pdf

Reference: 51 articles.

1. Tandem: a two-stage approach to maximize interpretability of drug response models based on multiple molecular data types;Aben;Bioinformatics,2016

2. Machine learning approaches to drug response prediction: challenges and recent progress;Adam;NPJ Precision Oncol,2020

3. Evidence for the existence of a cxcl17 receptor distinct from gpr35;Amir;J. Immunol,2018

Cited by 8 articles.

1. Consensus Clustering for Robust Bioinformatics Analysis;2024-03-23

2. Snowflake: visualizing microbiome abundance tables as multivariate bipartite graphs;Frontiers in Bioinformatics;2024-02-05

3. Glioblastoma vulnerability to neddylation inhibition is dependent on PTEN status, and dysregulation of the cell cycle and DNA replication;Neuro-Oncology Advances;2024-01-01

4. Comprehensive pan-cancer analysis reveals CCDC58 as a carcinogenic factor related to immune infiltration;Apoptosis;2023-12-08

5. Fairness-aware Maximal Biclique Enumeration on Bipartite Graphs;2023 IEEE 39th International Conference on Data Engineering (ICDE);2023-04