msRepDB: a comprehensive repetitive sequence database of over 80 000 species

Published:2021-12-01 Issue:D1 Volume:50 Page:D236-D245
ISSN:0305-1048
Container-title:Nucleic Acids Research
language:en

Author:

Liao Xingyu¹², Hu Kang², Salhi Adil¹, Zou You², Wang Jianxin², Gao Xin¹

Affiliation:

1. Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia

2. Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, P.R. China

Abstract

Repeats are prevalent in the genomes of all bacteria, plants and animals, and they cover nearly half of the human genome. They play indispensable roles in evolution, inheritance, variation and genomic instability, and serve as substrates for chromosomal rearrangements that include disease-causing deletions, inversions and translocations. Comprehensive identification, classification and annotation of repeats in genomes can provide accurate and targeted solutions towards the understanding and diagnosis of complex diseases, the optimization of plant properties and the development of new drugs. RepBase and Dfam are the two most frequently used repeat databases, but they are not sufficiently complete. Due to the lack of a comprehensive multi-species repeat database, current research in this field is far from satisfactory. LongRepMarker is a new framework recently developed by our group for the comprehensive identification of genomic repeats. Here we propose msRepDB, based on LongRepMarker, which is currently the most comprehensive multi-species repeat database, covering >80 000 species. Comprehensive evaluations show that msRepDB contains more species, and more complete repeats and families, than the RepBase and Dfam databases (https://msrepdb.cbrc.kaust.edu.sa/pages/msRepDB/index.html).

Funder

National Natural Science Foundation of China

King Abdullah University of Science and Technology

Hunan Provincial Natural Science Foundation

Hunan Provincial Science and Technology Program

111 Project

Publisher

Oxford University Press (OUP)

Subject

Genetics

Link

https://academic.oup.com/nar/article-pdf/50/D1/D236/42058092/gkab1089.pdf