Abstract
Background
Huntington’s disease (HD) is an autosomal dominant disease triggered by a large expansion of CAG nucleotides in the HTT gene. While the CAG expansion linearly correlates with the age of disease onset in HD, twin studies and cohorts of Juvenile Onset HD (JOHD) patients have shown that other factors influence the progression of HD. It would therefore be of interest to identify molecular biomarkers that indicate predisposition to developing HD. Because microRNAs (miRNAs) circulate in bio-fluids, they would be particularly useful as such biomarkers. We explored a large HD miRNA-mRNA expression dataset (GSE65776) to establish appropriate questions that could be addressed using Machine Learning (ML). We sought sets of features (mRNAs or miRNAs) to predict HD or WT samples from aged or young mouse cortex samples, and we asked whether a set of features could predict predisposition to HD or WT genotypes by training models on aged samples and testing them on young samples. Several models were created using AdaBoost, ExtraTrees, GaussianNB and Random Forest, and the best performing models were further analysed using AUC curves and PCA plots. Finally, genes used to train our miRNA-based predisposition model were mined from HD patient bio-fluid samples.
Results
Our testing accuracies ranged from 66% to 100% and our AUC scores from 31% to 100%. We generated several excellent models with testing accuracies >80% and AUC scores >90%. We also identified homologues of mmu-miR-154-5p, mmu-miR-181a-5p, mmu-miR-212-3p, mmu-miR-378b, mmu-miR-382-5p and mmu-miR-770-5p from our miRNA-based predisposition model circulating in HD patient blood samples at p-values <0.05.
Conclusions
We generated several age-based models which could differentiate between HD and WT samples, including an aged mRNA-based model with a 100% AUC score, an aged miRNA-based model with a 92% AUC score and an aged miRNA-based model with a 96% AUC score. We also identified several miRNAs used to train our miRNA-based predisposition model that were detectable in HD patient blood samples, suggesting they are potential candidates for use as non-invasive biomarkers in HD research.
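As an aid to reading the AUC scores reported above, the following is a minimal sketch of how an AUC can be computed from class labels and model scores. It is not the authors' pipeline. The function name `auc_score`, the labels and the scores are all hypothetical and chosen only for illustration. The sketch uses the pairwise definition of AUC: the fraction of (HD, WT) pairs in which the HD sample receives the higher score, with ties counted as half.

```python
def auc_score(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 1 (positive, e.g. HD) or 0 (negative, e.g. WT).
    scores: model scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data for illustration only (1 = HD, 0 = WT).
labels = [1, 1, 1, 0, 0, 0]
good_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]  # mostly ranks HD above WT
poor_scores = [0.2, 0.6, 0.3, 0.7, 0.5, 0.4]  # mostly ranks WT above HD

print(auc_score(labels, good_scores))
print(auc_score(labels, poor_scores))
```

With these made-up inputs the first call gives 8/9 and the second 2/9. An AUC of 0.5 corresponds to chance-level ranking, so a score below 50% means the model ranks WT samples above HD samples more often than the reverse.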
Publisher
Cold Spring Harbor Laboratory