Affiliation:
1. Department of Biology, Clark University, Worcester, Massachusetts, USA
ABSTRACT
De novo genes are important for evolutionary innovation. However, how these genes originate and spread remains largely unknown. To better understand this, we rigorously searched for de novo genes in Saccharomyces cerevisiae S288C and examined their spread and fixation in the population. We identified 84 de novo genes that arose in S. cerevisiae S288C since its divergence from its sister groups. Transcriptome and ribosome profiling data revealed that at least 8 (10%) and 28 (33%) de novo genes were expressed and translated, respectively, only under specific conditions. DNA microarray data, based on a 2-fold change threshold, showed that 87% of the de novo genes are regulated during various biological processes, such as nutrient utilization and sporulation. Our comparative and evolutionary analyses further revealed that several factors, including single nucleotide polymorphism (SNP)/indel mutation, high GC content, and DNA shuffling, contribute to the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we provide evidence suggesting the possible parallel origin of a de novo gene in S. cerevisiae and Saccharomyces paradoxus. Together, our study provides several new insights into the origin and spread of de novo genes.
IMPORTANCE
De novo genes have emerged in many lineages during evolution, but the birth, spread, and function of these genes remain unresolved. Here we searched for de novo genes in Saccharomyces cerevisiae S288C using rigorous methods, which reduced the effects of poor annotation and genomic gaps on the identification of de novo genes. Through this analysis, we found 84 new genes originating de novo from previously noncoding regions, 87% of which are very likely involved in various biological processes. We noticed that 10% and 33% of de novo genes were expressed and translated, respectively, only under specific conditions; therefore, verification of de novo genes through transcriptome and ribosome profiling, especially from limited expression data, may underestimate the number of bona fide new genes. We further show that SNP/indel mutation, high GC content, and DNA shuffling could be involved in the birth of de novo genes, while domestication and natural selection drive the spread and fixation of these genes. Finally, we provide evidence suggesting the possible parallel origin of a new gene.
Publisher
American Society for Microbiology