Abstract
Background
Gene expression regulates many complex traits. In this study, datasets comprising transcriptome information and clinical traits related to fat composition and vitals were analyzed with several statistical methods to find relations between genes and clinical outcomes.

Results
Biological big data is diverse and voluminous, which makes for a complex case study and makes it difficult to establish a metric. Histological data with semi-quantitative scores proved unreliable for correlation with other vitals, such as cholesterol composition, which complicates the prediction of clinical outcomes. A composite of vitals turned out to be a better variable for regression and a better source of factors for gene analysis. Several genes were found to be statistically significant by ANOVA across the progressive categories of the preferred clinical variable.

Conclusions
ANOVA is proposed as a method for genetic information retrieval, to extract biological meaning from RNA-seq or microarray data while accounting for multiple classes of target variables. It provides a reliable statistical method to associate genes or clusters of genes with particular traits.

Supplementary information
Supplementary data are available in annexes.
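To make the per-gene ANOVA described in the abstract concrete, the following is a minimal sketch in Python, not the authors' pipeline. It assumes each gene has one expression value per sample and that each sample carries one category label of the clinical variable. The function names `one_way_anova_f` and `rank_genes_by_f`, the gene names, and the toy values are hypothetical and used only for illustration. Only the F statistic is computed here; a p-value would additionally require the F distribution, and a real analysis over many genes would also need a multiple-testing correction.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numeric values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        # Degenerate case: no within-group variation at all.
        return float("inf"), df_between, df_within

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def rank_genes_by_f(expression, labels):
    """Rank genes by their one-way ANOVA F statistic across sample categories.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of category labels, one per sample, in the same order.
    """
    categories = sorted(set(labels))
    scores = {}
    for gene, values in expression.items():
        groups = [
            [v for v, lab in zip(values, labels) if lab == cat]
            for cat in categories
        ]
        scores[gene] = one_way_anova_f(groups)[0]
    # Highest F first: strongest separation between category means.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): nine samples in three progressive categories.
labels = ["low"] * 3 + ["mid"] * 3 + ["high"] * 3
expression = {
    "geneA": [1, 2, 3, 2, 3, 4, 5, 6, 7],  # mean shifts with category
    "geneB": [1, 2, 3, 1, 2, 3, 1, 2, 3],  # identical means in all categories
}
ranking = rank_genes_by_f(expression, labels)
```

In this toy example `geneA` ranks above `geneB`, because its group means differ across categories while those of `geneB` do not. With SciPy available, a p-value could be obtained from the returned statistic and degrees of freedom, for instance via `scipy.stats.f.sf(F, df_between, df_within)`.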
Publisher
Cold Spring Harbor Laboratory