Abstract
Endogenous retroviruses (ERVs) are genomic elements of retroviral origin that are integrated in the genomes of many vertebrate species. They have been widely characterized in humans and model organisms, but their composition and abundance have not been fully catalogued in all domestic animals. Next-generation sequencing technologies, bioinformatics tools, genomic databases, and molecular cytogenetic techniques have revolutionized the exploration of genome structure. Here, we investigated the nature, abundance, and organization of ERVs, and assembled complete genomes of Jaagsiekte sheep retrovirus (JSRV), using high-throughput sequencing (HTS) data from two Iraqi domestic sheep breeds. We used graph-based read clustering (RepeatExplorer), frequency analysis of short motifs (k-mers), alignment to reference genome assemblies, and fluorescence in situ hybridization (FISH). Three classes of ERVs were identified, with a total genomic proportion of 0.55% across all analyzed whole-genome sequencing raw reads, while FISH to ovine metaphase chromosomes showed distributions of these ERVs ranging from abundant centromeric to dispersed. Furthermore, the complete JSRV genomes from the two Iraqi sheep breeds clustered phylogenetically with known enJSRV proviruses from sheep worldwide. Characterizing partial and complete sequences of mammalian ERVs provides insight into the genome landscape, supports future genome assemblies, and helps identify potential sources of disease when ERVs become active.
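
As an illustration of the k-mer frequency analysis mentioned above, the following minimal Python sketch counts overlapping k-mers in a set of reads and converts the counts to relative frequencies. It is not the pipeline used in this study: the function names `count_kmers` and `kmer_fractions`, the choice of k = 4, and the two toy reads are assumptions made only for the example. Highly repeated elements such as ERVs would be expected to contribute k-mers with elevated counts in real data.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every overlapping k-mer across an iterable of read sequences.

    k-mers containing the ambiguous base 'N' are skipped.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def kmer_fractions(counts):
    """Convert raw k-mer counts to fractions of all counted k-mers."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Toy reads, invented for illustration only (not data from the study).
toy_reads = ["ACGTACGT", "CGTACGNA"]
toy_counts = count_kmers(toy_reads, k=4)
toy_fractions = kmer_fractions(toy_counts)

# List k-mers from most to least frequent.
for kmer, n in toy_counts.most_common():
    print(kmer, n, round(toy_fractions[kmer], 3))
```

For whole-genome read sets, a dedicated k-mer counter would normally replace this in-memory approach; the sketch only shows the principle.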