Abstract
Genomic selection (GS) is increasingly widely applied owing to the promising results obtained, the reduced cost of generating single nucleotide polymorphism (SNP) markers, and the development of statistical models that improve the robustness and accuracy of the analysis. GS may shorten the selection cycle, which has a major impact, especially for perennial species. The composition and size of the training population have a major influence on GS, which poses challenges for interspecific biparental populations. Another factor is the choice of reference genome for SNP calling: using reference genomes from different species could make it possible to explore the variability in interspecific crosses more comprehensively. Late leaf rust is a disease caused by the pathogen Acculeastrum americanum. The rare reports of genetic resistance to this pathogen involve the species Rubus occidentalis, which makes interspecific hybridization necessary to combine the fruit quality of R. idaeus with the resistance of R. occidentalis. We therefore evaluated the effect of different reference genomes on SNP marker discovery, as well as the effect of training population optimization (TPO) strategies on the accuracy of genomic predictions, namely CV-α, leave-one-family-out (LOFO), pairwise families, and stratified k-fold. Composing the training set in a stratified manner, together with a marker matrix that combines the reference genomes, increased the model's predictive capacity. These results corroborate that genomic prediction aligned with SNP calling and training population optimization strategies can significantly increase genetic gains in interspecific biparental crosses.