ActDES – a Curated Actinobacterial Database for Evolutionary Studies

Author:

Schniete Jana K., Selem-Mojica Nelly, Birke Anna S., Cruz-Morales Pablo, Hunter Iain S., Barona-Gómez Francisco, Hoskisson Paul A.

Abstract

Actinobacteria are a large and diverse phylum of bacteria that contains medically and ecologically relevant organisms. Many members are valuable sources of bioactive natural products and chemical precursors that are exploited in the clinic. These are made using the enzyme pathways encoded in their complex genomes. Whilst the number of sequenced genomes has increased rapidly in the last twenty years, the large size and complexity of many actinobacterial genomes mean that the sequences remain incomplete and consist of large numbers of contigs with poor annotation, which hinders large-scale comparative genomics and evolutionary studies. To enable greater understanding and exploitation of actinobacterial genomes, specialist genomic databases must be linked to high-quality genome sequences. Here we provide a curated database of 612 high-quality actinobacterial genomes from 80 genera, chosen to represent a broad phylogenetic group with equivalent genome reannotation. This database provides researchers with a framework for evolutionary and metabolic studies, a foundation for genome and metabolic engineering, and a means to facilitate the discovery of novel bioactive therapeutics and studies on gene family evolution.

Significance as a bioresource to the community

The Actinobacteria are a large, diverse phylum of bacteria, often with large, complex genomes with a high G+C content. Sequence databases vary greatly in the quality of sequences, equivalence of annotation and phylogenetic representation, which makes it challenging to undertake evolutionary and phylogenetic studies. To address this, we have assembled a curated, taxa-specific, non-redundant database to aid detailed comparative analysis of Actinobacteria. ActDES constitutes a novel resource for the community of actinobacterial researchers that will be useful primarily for two types of analyses: (i) comparative genomic studies, facilitated by reliable identification of orthologs across a set of defined, phylogenetically representative genomes, and (ii) phylogenomic studies, which will be improved by identification of gene subsets at a specified taxonomic level. These analyses can then act as a springboard for studies of the evolution of virulence genes, the evolution of metabolism and the identification of targets for metabolic engineering.

Data summary

All genome sequences used in this study can be found in the NCBI taxonomy browser https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/www.tax.cgi and are summarised along with Accession numbers in Table S1

All other data are available on Figshare https://doi.org/10.6084/m9.figshare.12167529 and https://doi.org/10.5281/zenodo.3830391

Perl script files available on GitHub https://github.com/nselem/ActDES including details of how to batch annotate genomes in RAST from the terminal https://github.com/nselem/myrast

Supp. Table S1 List of genomes from NCBI (Actinobacteria database.xlsx) https://doi.org/10.6084/m9.figshare.12167529

CVS genome annotation files including the FASTA files of nucleotide and amino acid sequences (individual .cvs files) https://doi.org/10.6084/m9.figshare.12167880

BLAST nucleotide database (.fasta file) https://doi.org/10.6084/m9.figshare.12167724

BLAST protein database (.fasta file) https://doi.org/10.6084/m9.figshare.12167724

Supp. Table S2 Expansion table genus level (Expansion table.xlsx Tab Genus level) https://doi.org/10.6084/m9.figshare.12167529

Supp. Table S2 Expansion table species level (Expansion table.xlsx Tab species level) https://doi.org/10.6084/m9.figshare.12167529

All GlcP and Glk data (BLAST hits from the ActDES database, MUSCLE alignment files and .nwk tree files) can be found at https://doi.org/10.6084/m9.figshare.12167529

Interactive trees in Microreact for Glk tree https://microreact.org/project/w_KDfn1xA/90e6759e and associated files can be found at https://doi.org/10.6084/m9.figshare.12326441.v1

Interactive trees in Microreact for GlcP tree https://microreact.org/project/VBUdiQ5_k/0fc4622b and associated files can be found at https://doi.org/10.6084/m9.figshare.12326441.v1

Publisher

Cold Spring Harbor Laboratory
