Unique Features of the Loblolly Pine (Pinus taeda L.) Megagenome Revealed Through Sequence Annotation

Author:

Wegrzyn Jill L (1), Liechty John D (1), Stevens Kristian A (2), Wu Le-Shin (3), Loopstra Carol A (4), Vasquez-Gross Hans A (1), Dougherty William M (2), Lin Brian Y (1), Zieve Jacob J (1), Martínez-García Pedro J (1), Holt Carson (5), Yandell Mark (5), Zimin Aleksey V (6), Yorke James A (6, 7), Crepeau Marc W (2), Puiu Daniela (8), Salzberg Steven L (8), de Jong Pieter J (9), Mockaitis Keithanne (10), Main Doreen (11), Langley Charles H (2), Neale David B (1)

Affiliation:

1. Department of Plant Sciences, University of California, Davis, California 95616

2. Department of Evolution and Ecology, University of California, Davis, California 95616

3. National Center for Genome Analysis Support, Indiana University, Bloomington, Indiana 47405

4. Department of Ecosystem Science and Management, Texas A&M University, College Station, Texas 77843

5. Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112

6. Institute for Physical Sciences and Technology, University of Maryland, College Park, Maryland 20742

7. Departments of Mathematics and Physics, University of Maryland, College Park, Maryland 20742

8. Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, The Johns Hopkins University, Baltimore, Maryland 21205

9. Children’s Hospital Oakland Research Institute, Oakland, California 94609

10. Department of Biology, Indiana University, Bloomington, Indiana 47405

11. Department of Horticulture, Washington State University, Pullman, Washington 99163

Abstract

The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20–40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with those of 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.
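The counts quoted in the abstract can be put in proportion with a little arithmetic. The Python sketch below is only an illustration and is not part of the published analysis: the variable and function names are hypothetical, and applying the 82% repeat estimate to the 20.1 Gb assembly length is an assumption made here to get a rough size for the repetitive fraction.

```python
# Figures quoted in the abstract; all names below are illustrative only.
ASSEMBLY_GB = 20.1              # assembled sequence, in Gb
GENE_MODELS = 50_172            # gene models generated by MAKER-P
HIGH_CONFIDENCE = 15_653        # gene models classified as high confidence
GENE_FAMILIES = 20_646          # families from clustering with 13 other species
CONIFER_FAMILIES = 1_554        # families predicted to be unique to conifers
LOBLOLLY_ONLY_FAMILIES = 159    # conifer families with only loblolly pine members
REPEAT_FRACTION = 0.82          # combined tandem + interspersed repeat estimate


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def summarize():
    """Derive simple proportions from the abstract's figures."""
    return {
        "high_confidence_pct": percent(HIGH_CONFIDENCE, GENE_MODELS),
        "conifer_family_pct": percent(CONIFER_FAMILIES, GENE_FAMILIES),
        "loblolly_only_pct": percent(LOBLOLLY_ONLY_FAMILIES, CONIFER_FAMILIES),
        # Assumption: the 82% estimate is applied to the assembly length.
        "repeat_gb": REPEAT_FRACTION * ASSEMBLY_GB,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions, roughly 31% of the gene models are high confidence, about 7.5% of the gene families are conifer-specific, about 10% of those conifer-specific families contain only loblolly pine members, and the repetitive fraction corresponds to roughly 16.5 Gb of the assembly.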

Publisher

Oxford University Press (OUP)

Subject

Genetics
