Benchmarking enrichment analysis methods with the disease pathway network

Author:

Buzzao Davide (1), Castresana-Aguirre Miguel (2), Guala Dimitri (1), Sonnhammer Erik L L (1)

Affiliation:

1. Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 171 21 Solna, Sweden

2. K7 Department of Oncology-Pathology, Karolinska Institute, 171 77 Stockholm, Sweden

Abstract

Enrichment analysis (EA) is a common approach to gain functional insights from genome-scale experiments. As a consequence, a large number of EA methods have been developed, yet it is unclear from previous studies which method is best for a given dataset. The main issues with previous benchmarks include the difficulty of correctly assigning true pathways to a test dataset, and the lack of generality of the evaluation metrics, for which the rank of a single target pathway is commonly used. Here we provide a generalized EA benchmark and apply it to the most widely used EA methods, representing all four categories of current approaches. The benchmark employs a new set of 82 curated gene expression datasets from DNA microarray and RNA-Seq experiments for 26 diseases, of which only 13 are cancers. To address the shortcomings of the single target pathway approach and to enhance the sensitivity evaluation, we present the Disease Pathway Network, in which related Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways are linked. We introduce a novel approach to evaluating pathway EA that combines sensitivity and specificity to provide a balanced evaluation of EA methods. This approach identifies Network Enrichment Analysis methods as the overall top performers compared with overlap-based methods. By using randomized gene expression datasets, we explore the null hypothesis bias of each method, revealing that most of them produce skewed P-values.
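The evaluation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of a simple arithmetic mean to combine sensitivity and specificity, and the 0.05 threshold are all assumptions made for illustration. Here the "positives" stand for a target pathway plus its linked pathways, and the "negatives" for pathways assumed unrelated to the disease.

```python
def sensitivity(significant, positives):
    """Fraction of expected (positive) pathways that a method reports as significant."""
    positives = set(positives)
    return len(positives & set(significant)) / len(positives)


def specificity(significant, negatives):
    """Fraction of unrelated (negative) pathways that a method does NOT report."""
    negatives = set(negatives)
    return 1 - len(negatives & set(significant)) / len(negatives)


def balanced_score(sens, spec):
    """Hypothetical combination: plain average of sensitivity and specificity."""
    return (sens + spec) / 2


def fraction_below(p_values, alpha=0.05):
    """Share of P-values below alpha; on randomized data an unbiased
    method should give a value close to alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)
```

For example, a method that reports pathways `{"a", "b"}` when the positives are `{"a", "c"}` and the negatives are `{"b", "x", "y", "z"}` has sensitivity 0.5 and specificity 0.75 under these definitions. A `fraction_below` value on randomized datasets that is far from `alpha` would suggest skewed P-values.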

Funder

Swedish Research Council

Stockholm University

Publisher

Oxford University Press (OUP)

