GENCODE: reference annotation for the human and mouse genomes in 2023

Author:

Frankish Adam (1), Carbonell-Sala Sílvia (2), Diekhans Mark (3), Jungreis Irwin (4,5), Loveland Jane E (1), Mudge Jonathan M (1), Sisu Cristina (6,7), Wright James C (8), Arnan Carme (2), Barnes If (1), Banerjee Abhimanyu (9,10), Bennett Ruth (1), Berry Andrew (1), Bignell Alexandra (1), Boix Carles (4,5), Calvet Ferriol (2), Cerdán-Vélez Daniel (11), Cunningham Fiona (1), Davidson Claire (1), Donaldson Sarah (1), Dursun Cagatay (6,12), Fatima Reham (1), Giorgetti Stefano (1), Giron Carlos García (1), Gonzalez Jose Manuel (1), Hardy Matthew (1), Harrison Peter W (1), Hourlier Thibaut (1), Hollis Zoe (1), Hunt Toby (1), James Benjamin (4,5), Jiang Yunzhe (12), Johnson Rory (13,14), Kay Mike (1), Lagarde Julien (2), Martin Fergal J (1), Gómez Laura Martínez (11), Nair Surag (9,10), Ni Pengyu (6,12), Pozo Fernando (11), Ramalingam Vivek (9,10), Ruffier Magali (1), Schmitt Bianca M (1), Schreiber Jacob M (9,10), Steed Emily (1), Suner Marie-Marthe (1), Sumathipala Dulika (1), Sycheva Irina (1), Uszczynska-Ratajczak Barbara (15), Wass Elizabeth (1), Yang Yucheng T (6,16), Yates Andrew (1), Zafrulla Zahoor (9,10), Choudhary Jyoti S (8), Gerstein Mark (6,12), Guigo Roderic (2,17), Hubbard Tim J P (18), Kellis Manolis (4,5), Kundaje Anshul (9,10), Paten Benedict (3), Tress Michael L (11), Flicek Paul (1)

Affiliation:

1. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK

2. Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain

3. UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA

4. MIT Computer Science and Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA 02139, USA

5. Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA

6. Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA

7. Department of Life Sciences, Brunel University London, Uxbridge UB8 3PH, UK

8. Functional Proteomics, Division of Cancer Biology, Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK

9. Department of Genetics, Stanford University, Palo Alto, CA, USA

10. Department of Computer Science, Stanford University, Palo Alto, CA, USA

11. Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain

12. Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA

13. Department of Medical Oncology, Bern University Hospital, Murtenstrasse 35, 3008 Bern, Switzerland

14. School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, D04 V1W8, Ireland

15. Computational Biology of Noncoding RNA, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland

16. Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China

17. Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra (UPF), Barcelona, E-08003 Catalonia, Spain

18. Department of Medical and Molecular Genetics, King's College London, Guys Hospital, Great Maze Pond, London SE1 9RT, UK

Abstract

GENCODE produces high-quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.

Funder

National Institutes of Health

Wellcome Trust

European Molecular Biology Laboratory

Publisher

Oxford University Press (OUP)

Subject

Genetics

Cited by 44 articles.