A roadmap for the functional annotation of protein families: a community perspective

Author:

de Crécy-Lagard Valérie1, Amorin de Hegedus Rocio2, Arighi Cecilia3, Babor Jill1, Bateman Alex4, Blaby Ian5, Blaby-Haas Crysten6, Bridge Alan J7, Burley Stephen K8, Cleveland Stacey1, Colwell Lucy J9, Conesa Ana10, Dallago Christian11, Danchin Antoine12, de Waard Anita13, Deutschbauer Adam14, Dias Raquel1, Ding Yousong15, Fang Gang16, Friedberg Iddo17, Gerlt John18, Goldford Joshua19, Gorelik Mark1, Gyori Benjamin M20, Henry Christopher21, Hutinet Geoffrey1, Jaroch Marshall1, Karp Peter D22, Kondratova Liudmyla2, Lu Zhiyong23, Marchler-Bauer Aron23, Martin Maria-Jesus4, McWhite Claire24, Moghe Gaurav D25, Monaghan Paul26, Morgat Anne7, Mungall Christopher J14, Natale Darren A27, Nelson William C28, O’Donoghue Seán29, Orengo Christine30, O’Toole Katherine H31, Radivojac Predrag32, Reed Colbie1, Roberts Richard J31, Rodionov Dmitri33, Rodionova Irina A34, Rudolf Jeffrey D35, Saleh Lana31, Sheynkman Gloria36, Thibaud-Nissen Francoise23, Thomas Paul D37, Uetz Peter38, Vallenet David39, Carter Erica Watson40, Weigele Peter R31, Wood Valerie41, Wood-Charlson Elisha M14, Xu Jin40

Affiliation:

1. Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA

2. Genetics Institute, University of Florida, Gainesville, FL 32611, USA

3. Department of Computer and Information Sciences, University of Delaware, Newark, DE 19713, USA

4. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK

5. US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

6. Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA

7. Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland

8. RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA

9. Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK

10. Spanish National Research Council, Institute for Integrative Systems Biology, Paterna, Valencia 46980, Spain

11. TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, i12, Boltzmannstr. 3, Garching/Munich 85748, Germany

12. School of Biomedical Sciences, Li KaShing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, SAR Hong Kong 999077, China

13. Research Collaboration Unit, Elsevier, Jericho, VT 05465, USA

14. Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

15. Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, USA

16. NYU-Shanghai, Shanghai 200120, China

17. Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA

18. Institute for Genomic Biology and Departments of Biochemistry and Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

19. Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

20. Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA

21. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA

22. Bioinformatics Research Group, SRI International, Menlo Park, CA 94025, USA

23. National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA

24. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA

25. Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA

26. Department of Agricultural Education and Communication, University of Florida, Gainesville, FL 32611, USA

27. Georgetown University Medical Center, Washington, DC 20007, USA

28. Biological Sciences Division, Pacific Northwest National Laboratories, Richland, WA 99354, USA

29. School of Biotechnology and Biomolecular Sciences, University of NSW, Sydney, NSW 2052, Australia

30. Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK

31. New England Biolabs, Ipswich, MA 01938, USA

32. Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA

33. Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA

34. Department of Bioengineering, Division of Engineering, University of California at San Diego, La Jolla, CA 92093-0412, USA

35. Department of Chemistry, University of Florida, Gainesville, FL 32611, USA

36. Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA

37. Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA

38. Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA

39. LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d’Évry, Université Paris-Saclay, CNRS, Evry 91057, France

40. Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA

41. Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK

Abstract

Over the last 25 years, biology has entered the genomic era and is becoming a science of ‘big data’. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by the more than 500 000 genomes sequenced to date. By different estimates, only half of the predicted proteins from sequenced genomes carry an accurate functional annotation, and this percentage varies drastically between organismal lineages. Such a large gap in knowledge hampers all aspects of the biological enterprise and thereby stands in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue, funded by the National Science Foundation, was held on 3–4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists in the same venue allowed for a comprehensive assessment of the current state of functional annotation of protein families. Further, the major issues obstructing the field were identified and discussed, which ultimately led to proposed solutions for how to move forward.

Funder

Division of Molecular and Cellular Biosciences

U.S. National Library of Medicine

Publisher

Oxford University Press (OUP)

Subject

General Agricultural and Biological Sciences; General Biochemistry, Genetics and Molecular Biology; Information Systems
