HOCOMOCO in 2024: a rebuild of the curated collection of binding models for human and mouse transcription factors

Author:

Vorontsov Ilya E 1, Eliseeva Irina A 2, Zinkevich Arsenii 13, Nikonov Mikhail 3, Abramov Sergey 14, Boytsov Alexandr 14, Kamenets Vasily 156, Kasianova Alexandra 78, Kolmykov Semyon 9, Yevshin Ivan S 10, Favorov Alexander 111, Medvedeva Yulia A 12, Jolma Arttu 13, Kolpakov Fedor 914, Makeev Vsevolod J 156, Kulakovskiy Ivan V 1215

Affiliation:

1. Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia

2. Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Russia

3. Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991 Moscow, Russia

4. Altius Institute for Biomedical Sciences, 98121 Seattle, WA, USA

5. Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia

6. Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, 450054 Ufa, Russia

7. Skolkovo Institute of Science and Technology, 121205 Moscow, Russia

8. Institute for Information Transmission Problems of the Russian Academy of Sciences, 127051 Moscow, Russia

9. Department of Computational Biology, Sirius University of Science and Technology, 354340 Sirius, Krasnodar region, Russia

10. Biosoft.Ru LLC, 630090 Novosibirsk, Russia

11. Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA

12. Research Center of Biotechnology RAS, Russian Academy of Sciences, 119071 Moscow, Russia

13. Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada

14. Bioinformatics Laboratory, Federal Research Center for Information and Computational Technologies, 630090 Novosibirsk, Russia

15. Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, 420008 Kazan, Russia

Abstract

We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns for 949 human transcription factors and 720 mouse orthologs. To build this release, we performed motif discovery in peak sets from 14 183 ChIP-Seq experiments and in reads from 2554 HT-SELEX experiments, yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was passed to automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.
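
For readers unfamiliar with position weight matrices, the following is a minimal sketch of how such a matrix can score candidate binding sites. It is not the HOCOMOCO pipeline or file format: the function names, the pseudocount, the uniform background, and the toy count matrix are all assumptions made for illustration, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log-odds weights.

    Assumes a uniform background and a fixed pseudocount (illustrative choices).
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm

def score(pwm, site):
    """Sum the log-odds weights of the bases observed at each motif position."""
    return sum(column[base] for column, base in zip(pwm, site))

def best_hit(pwm, sequence):
    """Return (score, offset) of the best-scoring forward-strand window."""
    width = len(pwm)
    hits = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)

# Hypothetical 4-bp motif "TGAC" built from made-up counts, not HOCOMOCO data.
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
toy_pwm = pwm_from_counts(toy_counts)
top_score, top_offset = best_hit(toy_pwm, "AATGACGG")
```

In practice a scan would also cover the reverse complement and apply a score threshold chosen per motif; both are omitted here to keep the sketch short.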

Funder

Russian Science Foundation

Non-commercial Foundation for Support of Science and Education ‘INTELLECT’

Ministry of Science and Higher Education of the Russian Federation

Government of the Russian Federation

Publisher

Oxford University Press (OUP)

Subject

Genetics
