Author:
Zhu Yuelin,Stephens Robert M,Meltzer Paul S,Davis Sean R
Abstract
Abstract
Background
The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Illumina (Genome Analyzer, HiSeq, MiSeq, .etc), Roche 454 GS System, Applied Biosystems SOLiD System, Helicos Heliscope, PacBio RS, and others.
Results
SRAdb is an attempt to make queries of the metadata associated with SRA submission, study, sample, experiment and run more robust and precise, and make access to sequencing data in the SRA easier. We have parsed all the SRA metadata into a SQLite database that is routinely updated and can be easily distributed. The SRAdb R/Bioconductor package then utilizes this SQLite database for querying and accessing metadata. Full text search functionality makes querying metadata very flexible and powerful. Fastq files associated with query results can be downloaded easily for local analysis. The package also includes an interface from R to a popular genome browser, the Integrated Genomics Viewer.
Conclusions
SRAdb Bioconductor package provides a convenient and integrated framework to query and access SRA metadata quickly and powerfully from within R.
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology
Reference5 articles.
1. The NCBI Sequence Read Archive[http://www.ncbi.nlm.nih.gov/sra]
2. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2012. [. [ISBN 3-900051-07-0] http://www.R-project.org/] []. [ISBN 3-900051-07-0]
3. Gentleman R, Carey V, Bates D, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al.: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004,5(10):R80. 10.1186/gb-2004-5-10-r80
4. Robinson J, Thorvaldsdóttir H, Winckler W, Guttman M, Lander E, Getz G, Mesirov J: Integrative genomics viewer. Nat Biotechnol 2011, 29: 24-26. 10.1038/nbt.1754
5. James DA, Falcon S: RSQLite: SQLite interface for R. 2012. . [R package version 0.11.2] http://CRAN.R-project.org/package=RSQLite . [R package version 0.11.2]
Cited by
107 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献