Foster thy young: Enhanced prediction of orphan genes in assembled genomes

Published: 2019-12-18

Authors:

Li Jing, Singh Urminder, Bhandary Priyanka, Campbell Jacqueline, Arendsee Zebulun, Seetharam Arun S., Wurtele Eve Syrkin

Abstract

Proteins encoded by newly emerged genes (“orphan genes”) share no sequence similarity with proteins in any other species. They provide organisms with a reservoir of genetic elements to quickly respond to changing selection pressures. Here, we systematically assess the ability of five gene annotation pipelines to accurately predict genes in genomes according to phylostratal origin. BRAKER and MAKER are existing, popular ab initio tools that infer gene structures by machine learning. Direct Inference is an evidence-based pipeline we developed to predict gene structures from alignments of RNA-Seq data. The BIND pipeline integrates the ab initio predictions of BRAKER with Direct Inference; MIND combines Direct Inference and MAKER predictions. We use highly curated Arabidopsis and yeast annotations as gold-standard benchmarks, and cross-validate in rice. Each pipeline under-predicts orphan genes (as few as 11%, under one prediction scenario). Increasing RNA-Seq diversity greatly improves prediction efficacy. The combined methods (BIND and MIND) yield the best predictions overall, with BIND identifying 68% of annotated orphan genes and 99% of ancient genes in Arabidopsis. We provide a lightweight, flexible, reproducible solution to improve gene prediction.

Publisher

Cold Spring Harbor Laboratory

References (86 articles)

1. The evolutionary origin of orphan genes

2. Coming of age: orphan genes in plants

3. De novo gene birth; PLoS Genetics, 2019

4. Synteny-based analyses indicate that sequence divergence is not the main source of orphan genes; eLife, 2020

5. Genetic novelty: How new genes are born; eLife, 2020

Cited by 7 articles.

1. Identification of novel PHD-finger genes in pepper by genomic re-annotation and comparative analyses; BMC Plant Biology; 2022-04-20

2. The Streptochaeta Genome and the Evolution of the Grasses; Frontiers in Plant Science; 2021-10-04

3. Landscape of the Dark Transcriptome Revealed Through Re-mining Massive RNA-Seq Data; Frontiers in Genetics; 2021-08-16

4. De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes; Science; 2021-08-06

5. The Streptochaeta genome and the evolution of the grasses; 2021-06-08