Abstract
Wheat (Triticum aestivum L.) is crucial to global food security but is often threatened by diseases, pests, and environmental stresses. Wheat stem sawfly (Cephus cinctus Norton) poses a major threat to food security in the United States, and solid-stem varieties, which carry the stem-solidness locus (Sst1), are the main source of genetic resistance against sawfly. Marker-assisted selection uses molecular markers to identify lines possessing beneficial haplotypes, like that of the Sst1 locus. In this study, an R package titled "HaploCatcher" was developed to predict specific haplotypes of interest in genome-wide genotyped lines. A training population of 1,056 lines genotyped for the Sst1 locus, known to confer stem solidness, and genome-wide markers was curated to make predictions of the Sst1 haplotypes for 292 lines from the Colorado State University wheat breeding program. Predicted Sst1 haplotypes were compared to marker-derived haplotypes. Our results indicated that the training set was substantially predictive, with kappa scores of 0.83 for k-nearest neighbors and 0.88 for random forest models. Forward validation on newly developed breeding lines demonstrated that a random forest model, trained on the total available training data, had comparable accuracy between forward and cross-validation. Estimated group means of lines classified by haplotypes from PCR-derived markers and predictive modeling did not significantly differ.
The HaploCatcher package is freely available and may be utilized by breeding programs, using their own training populations, to predict haplotypes for whole genome sequenced early generation material.
CORE IDEAS
Identification, introgression, and frequency increase of large effect loci are important for cultivar development.
The Sst1 locus has a significant effect on cutting score in fields exposed to sawfly infestation.
Historical genetic information can be utilized to predict haplotypes for lines which have genome-wide genetic data.
An R package, HaploCatcher, has been developed to facilitate this analysis in other programs.
Publisher
Cold Spring Harbor Laboratory