Omada: robust clustering of transcriptomes through multiple testing

Published:2024 Volume:13
ISSN:2047-217X
Container-title:GigaScience
language:en

Author:

Kariotis Sokratis¹²³, Tan Pei Fang¹², Lu Haiping⁴, Rhodes Christopher J³, Wilkins Martin R³, Lawrie Allan³, Wang Dennis¹²³⁴

Affiliation:

1. Singapore Institute for Clinical Sciences, Agency for Science, Technology and Research (A*STAR), 30 Medical Dr, 117609, Singapore, Republic of Singapore

2. Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis St, Matrix, 138671, Singapore, Republic of Singapore

3. National Heart and Lung Institute, Imperial College London, Guy Scadding Building, Dovehouse St, SW3 6LY, London, United Kingdom

4. Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello, S1 4DP, Sheffield, United Kingdom

Abstract

Abstract Background Cohort studies increasingly collect biosamples for molecular profiling and are observing molecular heterogeneity. High-throughput RNA sequencing is providing large datasets capable of reflecting disease mechanisms. Clustering approaches have produced a number of tools to help dissect complex heterogeneous datasets, but selecting the appropriate method and parameters to perform exploratory clustering analysis of transcriptomic data requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent. To address this, we have developed Omada, a suite of tools aiming to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning–based functions. Findings The efficiency of each tool was tested with 7 datasets characterized by different expression signal strengths to capture a wide spectrum of RNA expression datasets. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Within datasets with less clear biological distinctions, our tools either formed stable subgroups with different expression profiles and robust clinical associations or revealed signs of problematic data such as biased measurements. Conclusions In conclusion, Omada successfully automates the robust unsupervised clustering of transcriptomic data, making advanced analysis accessible and reliable even for those without extensive machine learning expertise. Implementation of Omada is available at http://bioconductor.org/packages/omada/.

Funder

British Heart Foundation

Publisher

Oxford University Press (OUP)

Link

https://academic.oup.com/gigascience/article-pdf/doi/10.1093/gigascience/giae039/58512436/giae039.pdf

Reference: 81 articles.

1. Complementing tissue characterization by integrating transcriptome profiling from the Human Protein Atlas and from the FANTOM5 consortium;Yu;Nucleic Acids Res,2015

2. The Genotype-Tissue Expression (GTEx) Project: linking clinical data with molecular analysis to advance personalized medicine;Keen;J Pers Med,2015

3. Proteomics. Tissue-based map of the human proteome;Uhlén;Science,2015

4. RNA sequencing-based longitudinal transcriptomic profiling gives novel insights into the disease mechanism of generalized pustular psoriasis;Wang;BMC Med Genomics,2018

5. Molecular subtyping of Alzheimer's disease using RNA sequencing data reveals novel mechanisms and targets;Neff;Sci Adv,2021

Cited by 1 article.

1. Omada: robust clustering of transcriptomes through multiple testing;GigaScience;2024