Abstract
This study presents an innovative approach for understanding the genetic underpinnings of two key phenotypes in Sorghum bicolor: maximum canopy height and maximum growth rate. Genome-Wide Association Studies (GWAS) are widely used to decipher the genetic basis of traits in organisms, but the challenge lies in selecting an appropriate statistical significance threshold for the analysis. Our goal was to employ GWAS to pinpoint the genetic markers associated with the phenotypes of interest, using permissive filtering thresholds that allow the inclusion of a broader collection of explanatory candidate genes. We then used a pattern recognition technique to prioritize a set of informative genes, which hold potential for further investigation and could find applications in Artificial Intelligence systems. Using a subset of the Sorghum Bioenergy Association Panel cultivated at the Maricopa Agricultural Center in Arizona, we sought to unveil patterns between phenotypic similarity and genetic proximity among accessions in order to organize the Single Nucleotide Polymorphisms (SNPs) that are likely to be associated with the phenotypic trait. Additionally, we explored the impact of this method by comparing the use of all SNPs against the use of only the SNPs selected by the GWAS pre-filter. Experimental results indicated that our approach effectively prioritizes SNPs and genes influencing the phenotype of interest. Moreover, this methodology holds promise for feature selection from genomic data when predicting complex phenotypic traits influenced by numerous genes and environmental conditions, and could pave the way for further research in this field.
Author Summary
Analyzing the relationships between an organism's phenotypes and its genotype and environment is a critical aspect of plant biology.
Over the past few decades, researchers have focused on identifying impactful genes that control plant phenotypes under diverse environmental conditions, a pursuit especially relevant in the face of global climate change. Applied field experiments and quantitative genetics have identified numerous regions of the genome influencing complex traits. However, the substantial time and cost required for physical experiments have prompted the use of computational methods to support this field of research. Our work aims to unravel the patterns between phenotypic similarity and genetic proximity across a panel of sorghum accessions in order to prioritize genes influencing the phenotype of interest. Our results suggest that this method is an effective way to prioritize informative genes for further analysis, and to select genomic features that machine learning and deep learning algorithms can use for predicting complex phenotypic traits.
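To make the idea of relating phenotypic similarity to genetic proximity concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes genotypes are coded as allele dosages (0, 1, 2), scores each SNP by the Pearson correlation between pairwise genotype differences and pairwise phenotype differences across accessions, and optionally restricts the candidates to SNPs below a permissive GWAS p-value threshold. The function name `prioritize_snps`, the scoring rule, and the default threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def prioritize_snps(genotypes, phenotype, pvalues=None, permissive_threshold=1e-3):
    """Rank SNPs by how well genotype differences track phenotype differences.

    genotypes: (n_accessions, n_snps) allele dosages (assumed coding: 0, 1, 2).
    phenotype: (n_accessions,) trait values, e.g. maximum canopy height.
    pvalues:   optional (n_snps,) GWAS p-values used as a permissive pre-filter.
    Returns (snp_indices, scores), sorted from highest to lowest score.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    n_accessions, n_snps = genotypes.shape

    # All unordered pairs of accessions (i < j).
    i, j = np.triu_indices(n_accessions, k=1)
    pheno_dist = np.abs(phenotype[i] - phenotype[j])

    # Either keep every SNP or only those passing the permissive pre-filter.
    if pvalues is None:
        candidates = np.arange(n_snps)
    else:
        candidates = np.flatnonzero(np.asarray(pvalues) < permissive_threshold)

    scores = np.zeros(len(candidates))
    for k, snp in enumerate(candidates):
        geno_dist = np.abs(genotypes[i, snp] - genotypes[j, snp])
        # A monomorphic SNP (or a constant phenotype) carries no signal; score stays 0.
        if geno_dist.std() == 0 or pheno_dist.std() == 0:
            continue
        scores[k] = np.corrcoef(geno_dist, pheno_dist)[0, 1]

    order = np.argsort(-scores)  # highest correlation first
    return candidates[order], scores[order]
```

Calling the function once with `pvalues=None` and once with GWAS p-values mirrors the comparison between using all SNPs and using only pre-filtered SNPs. A real analysis would also need to account for population structure and linkage disequilibrium, which this sketch ignores.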
Publisher
Cold Spring Harbor Laboratory