Abstract
Background
Conserved non-coding sequences (CNSs) are extensively studied for their regulatory properties and functional importance to organisms. Many features of these sequences have been investigated, including location, proximity to the likely target gene, lineage specificity, functionality of likely target genes, and nucleotide composition, and these studies have provided meaningful insight into the underlying evolutionary importance of these elements. How to assign function to non-coding regions of eukaryote genomes is another area of active investigation. On one hand, evolutionary analyses, including signatures of selection or conservation, can indicate the presence of constraint, suggesting that sequences evolving non-neutrally are candidates for functionality. On the other hand, there is evidence based on experimental profiling of transcription, methylation, histone modifications and chromatin state. While these types of data are important and are associated with function in most cases, this is not always the case. Evolutionary conservation, though highly conservative because it mostly considers elements identifiable in more than one species, is still used as the initial guideline for investigating function experimentally. An understanding of the experimental profiles of conserved non-coding regions, which may show patterns often associated with these potentially functional elements, could make it easier to infer the functionality of conserved non-coding regions.
Results
In an effort to integrate experimental profile data, we investigated evidence of expression of conserved non-coding sequences (CNSs).
For CNSs from ten primates, we assessed transcription, histone modifications, and the level of evolutionary constraint or accelerated evolution, as well as possible target genes, tissue expression profiles of likely target genes (as some CNSs may be enhancers, and may be ncRNAs that interact directly with mRNA) and clustering patterns of CNSs. In total, we found 153,475 CNSs conserved across all ten primates. Of these, 59,870 overlapped non-coding regions of ncRNA genes. H3K4Me1 marks (often associated with active enhancers) were highly correlated with CNSs, whereas H4K20Me1 (linked to, e.g., DNA damage repair) was highly correlated with conserved ncRNA regions (ncRNA-gene-CEs). Both CNSs and conserved ncRNA showed evidence of being under purifying selection. The CNSs in our dataset overall exhibited lower allele frequencies, consistent with higher levels of evolutionary constraint. We also found that CNSs and ncRNA-gene-CEs form mutually exclusive groups. The analyses also suggest that both types of conserved elements have undergone waves of accelerated evolution, which we speculate may indicate changes in regulatory requirements following divergence events. Finally, we find that likely target genes for hominoidae-, primate- and mammalian-specific CNSs and ncRNA-gene-CEs are predominantly associated with brain-related function in humans.
Conclusion
The deeply conserved primate CNSs and ncRNA-gene-CEs signify functional importance, suggesting ongoing recruitment of these elements into brain-related functions, consistent with King and Wilson’s hypothesis that regulatory changes may account for rapid changes in phenotype among primates.
Publisher
Cold Spring Harbor Laboratory