Abstract
Gene expression changes continually as diseases arise and progress. The large volume of available gene expression data makes it possible for clinical researchers to study the link between genotypes and phenotypes. This remains difficult, however, because the information contained in a gene expression matrix is sparse. Gene set enrichment analysis is a powerful tool for identifying the complex differential information carried by pathways. In this paper, we propose a method, called GSEMT, that performs gene set enrichment analysis by testing the correlation between a sample similarity matrix and a phenotype dissimilarity matrix. We conduct experiments on knowledge-based gene sets and gene expression datasets for hepatocellular carcinoma, and we demonstrate the effectiveness and advantages of GSEMT through comparison studies. GSEMT outperforms GSEA and GSNCA in classification performance on both an experimental dataset and an independent validation dataset. The results show that GSEMT is a useful alternative for gene set enrichment analysis.
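The abstract does not specify the test statistic or the similarity measure, so the following is only a minimal sketch of the general idea: a Mantel-style permutation test of the correlation between a sample similarity matrix and a phenotype dissimilarity matrix. The function name `matrix_correlation_test`, the use of Pearson correlation between sample profiles as the similarity, the 0/1 phenotype dissimilarity, the one-sided direction of the test, and the toy data are all assumptions made for illustration, not the published GSEMT procedure.

```python
import numpy as np

def matrix_correlation_test(sim, dis, n_perm=999, seed=0):
    """Mantel-style permutation test (hypothetical helper, not the authors' code).

    sim: (n, n) symmetric sample similarity matrix for one gene set.
    dis: (n, n) symmetric phenotype dissimilarity matrix.
    Returns the observed Pearson correlation between the upper triangles
    and a one-sided permutation p-value for a negative association.
    """
    sim = np.asarray(sim, dtype=float)
    dis = np.asarray(dis, dtype=float)
    n = sim.shape[0]
    iu = np.triu_indices(n, k=1)          # off-diagonal upper triangle
    x = sim[iu]
    observed = np.corrcoef(x, dis[iu])[0, 1]

    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        permuted = dis[np.ix_(p, p)]      # relabel samples: permute rows and columns together
        r = np.corrcoef(x, permuted[iu])[0, 1]
        # Assumption: an informative gene set makes same-phenotype samples
        # more similar, so we count permutations at least as negative.
        if r <= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value

# Toy data (fabricated for illustration only): 12 samples, 8 genes in the set.
rng = np.random.default_rng(1)
labels = np.array([0] * 6 + [1] * 6)
pattern = np.arange(1.0, 9.0)
expr = np.where(labels[:, None] == 0, pattern, pattern[::-1])
expr = expr + rng.normal(scale=0.1, size=expr.shape)

sim = np.corrcoef(expr)                                   # sample-by-sample similarity
dis = (labels[:, None] != labels[None, :]).astype(float)  # 0 = same phenotype, 1 = different

observed, p_value = matrix_correlation_test(sim, dis)
print(f"correlation = {observed:.3f}, permutation p-value = {p_value:.3f}")
```

Under these assumptions, a strongly negative correlation with a small p-value would suggest that the gene set separates the phenotypes. Rows and columns are permuted together because the entries of a pairwise matrix are not independent, which is why a standard correlation test on the flattened matrices would not be appropriate.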
Subject
General Physics and Astronomy