Abstract
AbstractCancer is a genetic disease where gene mutations are pivotal in disease initiation and pathophysiology. The gene expression profile follows a specific pattern exclusive to each cancer which can be utilized for early and accurate diagnosis. Microarray techniques have emerged as powerful tools capable of simultaneously capturing the expression profiles of thousands of genes. However, because of the high dimensionality of the produced transcriptome data, analysis of the resulting datasets is challenging. Recent advancements in Artificial Intelligence (AI) techniques like Machine Learning (ML) and Deep Learning can be instrumental in efficiently processing these high-dimensional datasets. LASSO-regression is a ML technique that can help to rank the features which could help in feature selection leading to dimensionality reduction. Deep Learning is one of the most sophisticated ML techniques that can process high-dimensional data owing to the presence of more number of hidden layers in its neural network. We designed a Deep Neural Network (DNN) classifier model fused with a LASSO-based significant feature extractor for classifying the gene expression dataset containing a total of 51 samples of which 24 samples are of lung cancer patients and the remaining 27 samples are of normal individuals. A LASSO regression model was implemented to identify the genes that played a significant role in the classification. These significant gene expressions were then fed into a convergent Deep Neural Architecture. The classifier was trained with 70% data and the rest 30% was used for validation. The proposed classifier proved to provide better classification as compared to LASSO regression and DNN used individually. The two classes were classified with an average accuracy of 96.25%, average precision of 99.67%, average specificity of 99.45% and average sensitivity of 91.73% measured over thirty independent assessments. In some cases, the model was able to obtain a classification accuracy of 100%. This could open the path to early and better diagnosis of cancers from transcriptome data.
Publisher
Cold Spring Harbor Laboratory