Statistical Approach for Biologically Relevant Gene Selection from High-Throughput Gene Expression Data

Published: 2020-10-25 Issue: 11 Volume: 22 Page: 1205
ISSN: 1099-4300
Container-title: Entropy
Language: en
Short-container-title: Entropy

Author:

Das Samarendra, Rai Shesh N.

Abstract

Selection of biologically relevant genes from high-dimensional expression data is a key research problem in gene expression genomics. Most of the available gene selection methods are based on either relevance or redundancy measures, and are usually judged by post-selection classification accuracy. In these methods, genes are ranked on a single high-dimensional expression dataset, which leads to the selection of spuriously associated and redundant genes. Hence, we developed a statistical approach that combines a support vector machine with Maximum Relevance and Minimum Redundancy under a sound statistical setup for the selection of biologically relevant genes. Here, the genes were selected through statistical significance values computed using a nonparametric test statistic under a bootstrap-based subject sampling model. Further, a systematic and rigorous evaluation of the proposed approach against nine existing competitive methods was carried out on six different real crop gene expression datasets. This performance analysis was carried out under three comparison settings, i.e., subject classification, and biological relevance criteria based on quantitative trait loci and on gene ontology. Our analytical results showed that the proposed approach selects genes that are more biologically relevant than those selected by the existing methods. Moreover, the proposed approach was also found to perform better than the competing existing methods. The proposed statistical approach provides a framework for combining filter and wrapper methods of gene selection.
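
The general idea of selecting genes by significance values under bootstrap-based subject sampling can be illustrated with a minimal sketch. This is not the authors' implementation and it rests on several assumptions. The per-gene score (absolute difference of class means) is only a stand-in for the paper's combined support vector machine and Maximum Relevance and Minimum Redundancy scoring. The one-sided sign test is a stand-in for the paper's nonparametric test statistic. The function names `relevance_scores`, `sign_test_pvalue`, `bootstrap_gene_pvalues`, and `make_toy_data`, along with the 0.5 rank-score threshold, are hypothetical.

```python
import math
import random


def relevance_scores(X, y):
    """Stand-in score per gene: absolute difference of the two class means."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 1]
        b = [row[g] for row, lab in zip(X, y) if lab == 0]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return scores


def sign_test_pvalue(values, threshold=0.5):
    """One-sided exact sign test of H0: median <= threshold (ties dropped)."""
    wins = sum(v > threshold for v in values)
    n = sum(v != threshold for v in values)
    if n == 0:
        return 1.0
    # P(Binomial(n, 0.5) >= wins)
    return sum(math.comb(n, k) for k in range(wins, n + 1)) / 2 ** n


def bootstrap_gene_pvalues(X, y, n_boot=200, seed=0):
    """Resample subjects within each class, rank genes, test the rank scores."""
    rng = random.Random(seed)
    idx0 = [i for i, lab in enumerate(y) if lab == 0]
    idx1 = [i for i, lab in enumerate(y) if lab == 1]
    n_genes = len(X[0])
    rank_scores = [[] for _ in range(n_genes)]
    for _ in range(n_boot):
        # Stratified bootstrap so both classes appear in every resample.
        sample = [rng.choice(idx0) for _ in idx0] + [rng.choice(idx1) for _ in idx1]
        Xb = [X[i] for i in sample]
        yb = [y[i] for i in sample]
        scores = relevance_scores(Xb, yb)
        # Ascending order: the most relevant gene gets rank score 1.0.
        order = sorted(range(n_genes), key=lambda g: scores[g])
        for pos, g in enumerate(order, start=1):
            rank_scores[g].append(pos / n_genes)
    return [sign_test_pvalue(rs) for rs in rank_scores]


def make_toy_data(n_per_class=15, n_genes=10, seed=1):
    """Synthetic data: gene 0 is shifted in class 1, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for lab in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_genes)]
            row[0] += 4.0 * lab
            X.append(row)
            y.append(lab)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for gene, p in enumerate(bootstrap_gene_pvalues(X, y, n_boot=100)):
        print(f"gene {gene}: p = {p:.3g}")
```

In this sketch a gene is called relevant when its rank score sits in the upper half of the ranking in significantly more than half of the bootstrap resamples. In practice the resulting p-values would also need a multiple-testing correction across genes.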

Publisher

MDPI AG

Subject

General Physics and Astronomy

Link

https://www.mdpi.com/1099-4300/22/11/1205/pdf

Reference: 56 articles.

1. High-Throughput Sequencing Technologies

2. DNA Microarrays: a Powerful Genomic Tool for Biomedical and Clinical Research

3. DNA Microarray

4. NCBI GEO: archive for functional genomics data sets—update

5. Statistical Approaches for Gene Selection, Hub Gene Identification and Module Interaction in Gene Co-Expression Network Analysis: An Application to Aluminum Stress in Soybean (Glycine max L.)

Cited by 8 articles.

1. Single-cell transcriptomics: background, technologies, applications, and challenges; Molecular Biology Reports; 2024-04-30

2. UFODMV: Unsupervised Feature Selection for Online Dynamic Multi-Views; Applied Sciences; 2023-03-28

3. A Framework for Comparison and Assessment of Synthetic RNA-Seq Data; Genes; 2022-12-14

4. Five Years of Gene Networks Modeling in Single-cell RNA-sequencing Studies: Current Approaches and Outstanding Challenges; Current Bioinformatics; 2022-12

5. Multigroup prediction in lung cancer patients and comparative controls using signature of volatile organic compounds in breath samples; PLOS ONE; 2022-11-30