Abstract
Metabolomic studies have improved the understanding of a broad range of biological tissues, fluids, and systems. Typically, metabolomic analyses employ a workflow that starts with detecting peaks from mass spectrometry data and is followed by a series of statistical analyses aimed at identifying dysregulated metabolites, group differences, and group similarities in dysregulated metabolites and pathways. Generating these group similarities relies on clustering analyses. However, current clustering methods are highly subjective and can be prone to errors, indicating the need for an updated workflow that addresses these issues. Here we present a novel metabolomics workflow that can produce unbiased, reproducible clustering results: ensemble clustering combined with cluster optimization (ECCO). The first step, cluster optimization, identifies an optimal number of clusters without bias. The second step, ensemble clustering, finds the consensus clustering solution across thirteen distance algorithms. This step improves the repeatability of analyses and eliminates bias by removing the need to choose a single distance algorithm for the clustering solution. We employ ECCO to analyze synovial fluid metabolites from patients with early and late osteoarthritis (OA). The method improves the detection of distinct metabolomic endotypes compared with conventional analyses. Furthermore, novel pathways were identified that correspond to different stages of OA. These results demonstrate the utility of ECCO in metabolomics workflows that involve clustering data. ECCO, which we provide as an open-source tool, can improve the repeatability, reliability, and ease of use of metabolomics analyses, and is therefore expected to increase confidence in the biological interpretation of these data.
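The two steps described above can be illustrated with a minimal sketch. It is not the ECCO implementation: the abstract does not name the thirteen distance algorithms, the optimization criterion, or the consensus method, so the three distance functions, the silhouette score, average-linkage clustering, and the co-association consensus used here are all assumptions chosen for illustration, as are the function names and the toy data.

```python
import math
from itertools import combinations

# Three illustrative distance functions (assumption: stand-ins for the
# thirteen distance algorithms, which the abstract does not list).
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def distance_matrix(samples, metric):
    n = len(samples)
    return [[metric(samples[i], samples[j]) for j in range(n)] for i in range(n)]

def average_linkage(dist, k):
    """Agglomerative clustering on a precomputed distance matrix; returns k labels."""
    clusters = [[i] for i in range(len(dist))]

    def mean_dist(pair):
        a, b = clusters[pair[0]], clusters[pair[1]]
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the two clusters with the smallest mean pairwise distance.
        a, b = min(combinations(range(len(clusters)), 2), key=mean_dist)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def silhouette(dist, labels):
    """Mean silhouette width; requires at least two clusters."""
    n, total = len(labels), 0.0
    for i in range(n):
        own = [dist[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        if not own:
            continue  # a singleton contributes 0 by convention
        a = sum(own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n

def ecco_sketch(samples, metrics, k_range):
    mats = [distance_matrix(samples, m) for m in metrics]

    # Step 1 (cluster optimization): pick k by mean silhouette over all metrics.
    def mean_silhouette(k):
        return sum(silhouette(d, average_linkage(d, k)) for d in mats) / len(mats)

    best_k = max(k_range, key=mean_silhouette)

    # Step 2 (ensemble clustering): count how often each pair of samples is
    # co-clustered across metrics, then cluster on 1 - co-clustering frequency.
    n = len(samples)
    together = [[0] * n for _ in range(n)]
    for d in mats:
        labels = average_linkage(d, best_k)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    consensus = [[1 - together[i][j] / len(mats) for j in range(n)] for i in range(n)]
    return best_k, average_linkage(consensus, best_k)

# Toy data (made up): two well-separated groups of three samples each.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
best_k, labels = ecco_sketch(samples, [euclidean, manhattan, chebyshev], range(2, 5))
print(best_k, labels)
```

Because the final labels come from agreement across all distance functions rather than from any single one, the result does not depend on which distance the analyst would otherwise have picked, which is the property the abstract attributes to the ensemble step.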
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.