Generation, annotation, analysis and database integration of 16,500 white spruce EST clusters

Published: 2005-10-19 Issue: 1 Volume: 6 Page:
ISSN: 1471-2164
Container-title: BMC Genomics
Language: en
Short-container-title: BMC Genomics

Author:

Pavy Nathalie, Paule Charles, Parsons Lee, Crow John A, Morency Marie-Josee, Cooke Janice, Johnson James E, Noumen Etienne, Guillet-Claude Carine, Butterfield Yaron, Barber Sarah, Yang George, Liu Jerry, Stott Jeff, Kirkpatrick Robert, Siddiqui Asim, Holt Robert, Marra Marco, Seguin Armand, Retzel Ernest, Bousquet Jean, MacKay John

Abstract

Background: The sequencing and analysis of ESTs is for now the only practical approach for large-scale gene discovery and annotation in conifers because their very large genomes are unlikely to be sequenced in the near future. Our objective was to produce extensive collections of ESTs and cDNA clones to support manufacture of cDNA microarrays and gene discovery in white spruce (Picea glauca [Moench] Voss).

Results: We produced 16 cDNA libraries from different tissues and a variety of treatments, and partially sequenced 50,000 cDNA clones. High quality 3' and 5' reads were assembled into 16,578 consensus sequences, 45% of which represented full length inserts. Consensus sequences derived from 5' and 3' reads of the same cDNA clone were linked to define 14,471 transcripts. A large proportion (84%) of the spruce sequences matched a pine sequence, but only 68% of the spruce transcripts had homologs in Arabidopsis or rice. Nearly all the sequences that matched the Populus trichocarpa genome (the only sequenced tree genome) also matched rice or Arabidopsis genomes. We used several sequence similarity search approaches for assignment of putative functions, including blast searches against general and specialized databases (transcription factors, cell wall related proteins), Gene Ontology term assignation and Hidden Markov Model searches against PFAM protein families and domains. In total, 70% of the spruce transcripts displayed matches to proteins of known or unknown function in the Uniref100 database (blastx e-value < 1e-10). We identified multigenic families that appeared larger in spruce than in the Arabidopsis or rice genomes. Detailed analysis of translationally controlled tumour proteins and S-adenosylmethionine synthetase families confirmed a twofold size difference. Sequences and annotations were organized in a dedicated database, SpruceDB. Several search tools were developed to mine the data either based on their occurrence in the cDNA libraries or on functional annotations.

Conclusion: This report illustrates specific approaches for large-scale gene discovery and annotation in an organism that is very distantly related to any of the fully sequenced genomes. The ArboreaSet sequences and cDNA clones represent a valuable resource for investigations ranging from plant comparative genomics to applied conifer genetics.

Publisher

Springer Science and Business Media LLC

Subject

Genetics, Biotechnology

Link

http://link.springer.com/content/pdf/10.1186/1471-2164-6-144.pdf

References: 71 articles.

1. Ahuja MR: Recent advances in molecular genetics of forest trees. Euphytica. 2001, 121: 173-195. 10.1023/A:1012226319449.

2. Dhillon SS: DNA in tree species. Cell and Tissue Culture in Forestry. Edited by: Bonga JM, Durzan DJ. 1987, Martinus Nijhoff Publishers, Dordrecht, 1: 298-313.

3. Wakamiya I, Newton RJ, Price JS: Genome size and environmental factors in the genus Pinus. Am J Bot. 1993, 80: 1235-1241.

4. Rake AW, Miksche JP, Hall RB, Hanson KM: DNA reassociation kinetics for four conifers. Can J Genet Cytol. 1980, 22: 69-79.

5. Ohri D, Khoshoo TN: Genome size in gymnosperms. Plant Syst Evol. 1986, 153: 119-132. 10.1007/BF00989421.

Cited by 105 articles.

1. Quorum sensing bacteria improve microbial networks stability and complexity in wastewater treatment plants; Environment International; 2024-05

2. Tree Improvement in Canada – past, present and future, 2023 and beyond; The Forestry Chronicle; 2024-03

3. Transcriptome features of stone cell development in weevil‐resistant and susceptible Sitka spruce; New Phytologist; 2023-07-04

4. Quorum sensing bacteria improve microbial networks stability and complexity in wastewater treatment plants; 2023-06-22

5. Spruce giga‐genomes: structurally similar yet distinctive with differentially expanding gene families and rapidly evolving genes; The Plant Journal; 2022-07-16