Authors:
Shen Jiquan, Shi Jiawei, Luo Junwei, Zhai Haixia, Liu Xiaoyan, Wu Zhengjiang, Yan Chaokun, Luo Huimin
Abstract
Motivation
Studies have shown that classifying cancer subtypes provides valuable information across cancer research, from aetiology and tumour biology to prognosis and personalized treatment. Current methods usually rely on gene expression data for subtype classification. However, cancer samples are scarce, and their gene expression features are high-dimensional and sparse, so most methods fail to achieve satisfactory classification results.
Results
We propose DCGN, a deep learning approach that combines a convolutional neural network (CNN) with a bidirectional gated recurrent unit (BiGRU) to perform nonlinear dimensionality reduction and to learn features that exclude irrelevant factors in gene expression data. DCGN first applies the synthetic minority oversampling technique (SMOTE) to balance the class distribution. The CNN handles the high-dimensional input and extracts important local features, while the BiGRU analyses deeper features and retains their important information. By combining the two networks, DCGN captures key features and addresses the challenges of small sample sizes and sparse, high-dimensional features. In experiments on breast and bladder cancer gene expression datasets, we compared DCGN with seven other cancer subtype classification methods; DCGN performed better than all seven and provided more satisfactory classification results.
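The oversampling step mentioned above (the synthetic minority oversampling technique, SMOTE) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `smote` and `balance`, the neighbour count `k`, the toy expression values, and the subtype labels are all assumptions made for illustration. The sketch only shows the core idea of SMOTE, which is to create synthetic minority-class samples by interpolating between a sample and one of its nearest same-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority-class neighbours."""
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between the base and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(samples, labels, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = list(samples), list(labels)
    for y, xs in by_class.items():
        extra = smote(xs, target - len(xs), k=k, seed=seed)
        out_x.extend(extra)
        out_y.extend([y] * len(extra))
    return out_x, out_y


# Hypothetical toy data: 6 samples of subtype "A", 2 of subtype "B", 3 genes each.
expr = [
    [5.1, 0.2, 3.3], [4.9, 0.1, 3.0], [5.4, 0.3, 3.6],
    [5.0, 0.2, 3.1], [5.2, 0.4, 3.4], [4.8, 0.1, 2.9],
    [1.0, 2.5, 0.5], [1.4, 2.9, 0.7],
]
subtypes = ["A"] * 6 + ["B"] * 2

balanced_x, balanced_y = balance(expr, subtypes, k=2)
```

After balancing, both toy classes contain the same number of samples, and every synthetic sample lies between two real samples of its own class. For real data, an established implementation would normally be preferable to this sketch.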
Funder
National Natural Science Foundation of China
Young Elite Teachers in Henan Province
Doctor Foundation of Henan Polytechnic University
Innovative and Scientific Research Team of Henan Polytechnic University
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology
Cited by
11 articles.