Affiliation:
1. School of Computer Science & Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning Province 116024, P. R. China
Abstract
Identifying valuable features from complex omics data is of great significance for disease diagnosis study. This paper proposes a new feature selection algorithm based on sample network (FS-SN) to mine important information from omics data. The sample network is constructed according to the sample neighbor relationship at the molecular (feature) expression level, and the distinguishing ability of the feature is evaluated based on the topology of the sample network. The sample network established on a feature with a strong discriminating ability tends to have many edges between the same group samples and few edges between the different group samples. At the same time, FS-SN removes redundant features according to the gravitational interaction between features. To show the validation of FS-SN, it was compared on ten public datasets with ERGS, mRMR, ReliefF, ATSD-DN, and INDEED which are efficient in omics data analysis. Experimental results show that FS-SN performed better than the compared methods in accuracy, sensitivity and specificity in most cases. Hence, FS-SN making use of the topology of the sample network is effective for analyzing omics data, it can identify key features that reflect the occurrence and development of diseases, and reveal the underlying biological mechanism.
Funder
the AI S&T Program from Yulin Branch, Dalian National Laboratory for Clean Energy, CAS, China
Publisher
World Scientific Pub Co Pte Ltd
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献