Abstract
A large proportion of genes in mammalian genomes are embedded within another gene, sharing nucleotide sequence. Characterisation of these host/nested gene pairs is required to understand their transcriptional crosstalk. We identify all host/nested gene pairs and show that they involve over a sixth of genes in both human and mouse. Host genes were more highly expressed and more likely to be protein-coding, and their nested gene partners mainly reside within the largest intron of their host. Individual analysis of host or nested gene expression did not reveal tissue-specific profiles, whereas co-expression profiles did, suggesting that host/nested gene crosstalk plays a role during differentiation and development. To assess true co-expression or mutual expression, testis scRNA-seq data were used, revealing that pairs can switch dynamically between these states during spermatogenesis. Host genes have a larger pool of isoforms, and when co-expression occurred at some testis-specific pairs, host transcript diversity increased. Scanning experimentally validated polyadenylation sites upstream of the nested gene shows that host polyadenylation sites are enriched, implying that alternative polyadenylation is one of the mechanisms influenced by host/nested gene co-expression. The collection of host/nested genes and our analysis are available in an Rshiny application.
Publisher
Cold Spring Harbor Laboratory