bootGSEA: a bootstrap and rank aggregation pipeline for multi-study and multi-omics enrichment analyses

Published:2024-04-03 Volume:4
ISSN:2673-7647
Container-title:Frontiers in Bioinformatics
Short-container-title:Front. Bioinform.

Author:

Hemandhar Kumar Shamini, Tapken Ines, Kuhn Daniela, Claus Peter, Jung Klaus

Abstract

Introduction: Gene set enrichment analysis (GSEA) subsequent to differential expression analysis is a standard step in transcriptomics and proteomics data analysis. Although many tools for this step are available, the results are often difficult to reproduce because set annotations can change in the databases: new features can be added or existing features can be removed. Such changes in set composition can, in turn, affect biological interpretation.

Methods: We present bootGSEA, a novel computational pipeline, to study the robustness of GSEA. By repeating GSEA on bootstrap samples, the variability and robustness of the results can be studied. In our pipeline, not all genes or proteins are involved in each bootstrap replicate of the analysis. Finally, we aggregate the ranks from the bootstrap replicates to obtain a score per gene set that shows whether it gains or loses evidence compared to the ranking of the standard GSEA. Rank aggregation is also used to combine GSEA results from different omics levels or from multiple independent studies at the same omics level.

Results: By applying our approach to six independent cancer transcriptomics datasets, we showed that bootstrap GSEA can aid in the selection of more robust enriched gene sets. Additionally, we applied our approach to paired transcriptomics and proteomics data obtained from a mouse model of spinal muscular atrophy (SMA), a neurodegenerative and neurodevelopmental disease associated with multi-system involvement. After obtaining a robust ranking at both omics levels, the two ranking lists were combined to aggregate the findings from the transcriptomics and proteomics results. Furthermore, we constructed the new R-package “bootGSEA,” which implements the proposed methods and provides graphical views of the findings. In the example datasets, bootstrap-based GSEA was able to identify gene or protein sets that were less robust when the set composition changed during bootstrap analysis.

Discussion: The rank aggregation step was useful for combining bootstrap results and making them comparable to the original findings on the single-omics level, or for combining findings from multiple different omics levels.
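
The idea described under Methods (repeat the enrichment analysis on bootstrap samples in which some genes are missing, then aggregate the per-replicate ranks and compare them with the original ranking) can be illustrated with a small Python sketch. This is not the bootGSEA R package and does not reproduce its API or its statistics. Everything here is an assumption made for illustration: the function names, the toy data, the enrichment score (mean absolute gene statistic of the set members that are present), and the aggregation rule (plain mean rank, used as a simple stand-in for whatever rank aggregation method the authors apply).

```python
import random
from statistics import mean


def set_scores(gene_stats, gene_sets):
    """Toy enrichment score (assumption): mean |statistic| of the members present."""
    scores = {}
    for name, members in gene_sets.items():
        present = [abs(gene_stats[g]) for g in members if g in gene_stats]
        scores[name] = mean(present) if present else 0.0
    return scores


def rank_sets(scores):
    """Rank 1 is the strongest set; ties are broken by set name for determinism."""
    ordered = sorted(scores, key=lambda s: (-scores[s], s))
    return {name: i + 1 for i, name in enumerate(ordered)}


def bootstrap_rank_shift(gene_stats, gene_sets, n_boot=200, seed=0):
    """Return original ranks, mean bootstrap ranks, and their difference per set."""
    rng = random.Random(seed)
    genes = sorted(gene_stats)
    original = rank_sets(set_scores(gene_stats, gene_sets))
    boot_ranks = {name: [] for name in gene_sets}
    for _ in range(n_boot):
        # Sampling genes with replacement leaves some genes out of each replicate,
        # so the effective composition of every set changes between replicates.
        kept = set(rng.choices(genes, k=len(genes)))
        subset = {g: gene_stats[g] for g in kept}
        for name, r in rank_sets(set_scores(subset, gene_sets)).items():
            boot_ranks[name].append(r)
    # Mean-rank aggregation (assumption): a simple stand-in for rank aggregation.
    aggregated = {name: mean(r) for name, r in boot_ranks.items()}
    # Positive shift: the set ranks better under resampling than in the original run.
    shift = {name: original[name] - aggregated[name] for name in gene_sets}
    return original, aggregated, shift


# Hypothetical toy data: per-gene statistics from a differential expression analysis.
gene_stats = {"g1": 3.0, "g2": 2.5, "g3": 2.8, "g4": 0.1,
              "g5": 0.2, "g6": 9.0, "g7": 0.3, "g8": 0.1}
gene_sets = {
    "robust_set": ["g1", "g2", "g3"],   # evidence spread over all members
    "fragile_set": ["g6", "g7", "g8"],  # evidence driven by the single gene g6
    "null_set": ["g4", "g5"],
}

original, aggregated, shift = bootstrap_rank_shift(gene_stats, gene_sets)
for name in gene_sets:
    print(name, original[name], round(aggregated[name], 2), round(shift[name], 2))
```

In this toy setup the set whose evidence rests on one gene ranks first in the original analysis but loses rank whenever that gene is absent from a replicate, while the set with evenly distributed evidence gains. The same mean-rank step could, under the same assumptions, be applied to rankings from different omics levels or studies instead of bootstrap replicates.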

Publisher

Frontiers Media SA

Cited by 1 article.

1. Combined Analysis of Multi-Study miRNA and mRNA Expression Data Shows Overlap of Selected miRNAs Involved in West Nile Virus Infections;Genes;2024-08-05