Affiliation:
1. College of Computer Science at Nankai University, China
2. College of Mathematics and Statistics Science at Ludong University, China
Abstract
Although the selection of functional genes has been studied for years, traditional selection methods tend to be ineffective at capturing potentially specific genes. First, typical methods look for features (genes) that are relevant to the class but irrelevant to each other. However, the features that offer rich discriminative information are more likely to be complementary ones. Second, almost all existing methods assess feature relations in pairs, which yields an inaccurate local estimate and lacks global exploration. In this paper, we introduce the multi-variable Area Under the receiver operating characteristic Curve (AUC) to globally evaluate the complementarity among features by employing the Area Above the receiver operating characteristic Curve (AAC). With AAC, the class-relevant information newly provided by a candidate feature, and that preserved by the already selected features, can be obtained beyond pairwise computation. Furthermore, we propose an AAC-based feature selection algorithm, named Multi-variable AUC-based Combined Features Complementarity, to screen discriminative complementary feature combinations. Extensive experiments on public datasets demonstrate the effectiveness of the proposed approach. In addition, we provide a gene set for prostate cancer and discuss its potential biological significance, both from a machine learning perspective and in light of existing biomedical findings on some of the individual genes.
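As a rough illustration of the quantities named in the abstract, the following minimal Python sketch computes the AUC of a single feature as the fraction of positive/negative sample pairs ranked correctly, and takes AAC as 1 - AUC. The abstract does not give the paper's multi-variable definition, so the `aac_reduction` helper is purely hypothetical: it scores a feature set by the plain sum of its features, which is an assumption made here for illustration and not the authors' method. The names `gene_a`, `gene_b` and the toy values are likewise made up.

```python
from itertools import product


def auc(scores, labels):
    """AUC as the probability that a random positive sample scores
    higher than a random negative one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def aac(scores, labels):
    """Area above the ROC curve: the share of class information
    that the scores fail to capture."""
    return 1.0 - auc(scores, labels)


def aac_reduction(selected, candidate, labels):
    """Hypothetical complementarity score (an assumption, not the
    paper's definition): how much AAC drops when `candidate` is added
    to the non-empty list of `selected` feature columns, using the
    sum of features as the joint score."""
    base = [sum(values) for values in zip(*selected)]
    joint = [b + c for b, c in zip(base, candidate)]
    return aac(base, labels) - aac(joint, labels)


# Toy data: six samples, three per class (values are made up).
labels = [0, 0, 0, 1, 1, 1]
gene_a = [0.1, 0.4, 0.8, 0.5, 0.9, 0.7]
gene_b = [0.5, 0.3, 0.0, 0.6, 0.2, 0.4]

gain = aac_reduction([gene_a], gene_b, labels)
```

In this toy data neither gene separates the classes on its own (AUC of 7/9 and 6/9 respectively), yet their sum does, so `gain` equals the full AAC of `gene_a`. This is the kind of complementary behaviour that a relevance-only or pairwise-redundancy criterion can overlook.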
Publisher
Oxford University Press (OUP)
Subject
Molecular Biology, Information Systems