Importance of timely metadata curation to the global surveillance of genetic diversity

Author:

Crandall Eric D. (1), Toczydlowski Rachel H. (2), Liggins Libby (3), Holmes Ann E. (4), Ghoojaei Maryam (5), Gaither Michelle R. (5), Wham Briana E. (6), Pritt Andrea L. (7), Noble Cory (3), Anderson Tanner J. (8), Barton Randi L. (9, 10), Berg Justin T. (11), Beskid Sofia G. (12), Delgado Alonso (13), Farrell Emily (5), Himmelsbach Nan (14), Queeno Samantha R. (8), Trinh Thienthanh (5), Weyand Courtney (15), Bentley Andrew (16), Deck John (17), Riginos Cynthia (18), Bradburd Gideon S. (2), Toonen Robert J. (19)

Affiliation:

1. Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA

2. Ecology, Evolution, and Behavior Program, Department of Integrative Biology, Michigan State University, East Lansing, Michigan, USA

3. School of Natural Sciences, Massey University, Auckland, New Zealand

4. Department of Animal Science, University of California, Davis, Davis, California, USA

5. Department of Biology, University of Central Florida, Orlando, Florida, USA

6. Department of Research Informatics and Publishing, The Pennsylvania State University Libraries, Pennsylvania State University, University Park, Pennsylvania, USA

7. Madlyn L. Hanes Library, The Pennsylvania State University Libraries, Pennsylvania State University, Middletown, Pennsylvania, USA

8. Department of Anthropology, University of Oregon, Eugene, Oregon, USA

9. Department of Marine Science, California State University Monterey Bay, Seaside, California, USA

10. Moss Landing Marine Laboratories, Moss Landing, California, USA

11. UOG Marine Laboratory, University of Guam, Mangilao, Guam

12. Department of Integrative Biology, University of Texas at Austin, Austin, Texas, USA

13. Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, Ohio, USA

14. Department of Natural Science, Hawai‘i Pacific University, Honolulu, Hawaii, USA

15. Department of Biological Sciences, Auburn University, Auburn, Alabama, USA

16. Biodiversity Institute, University of Kansas, Lawrence, Kansas, USA

17. Berkeley Natural History Museums, University of California, Berkeley, Berkeley, California, USA

18. School of Biological Sciences, The University of Queensland, Brisbane, Queensland, Australia

19. Hawai‘i Institute of Marine Biology, University of Hawai‘i at Mānoa, Kaneohe, Hawaii, USA

Abstract

Genetic diversity within species represents a fundamental yet underappreciated level of biodiversity. Because genetic diversity can indicate species resilience to changing climate, its measurement is relevant to many national and global conservation policy targets. Many studies produce large amounts of genome‐scale genetic diversity data for wild populations, but most (87%) do not include the associated spatial and temporal metadata necessary for them to be reused in monitoring programs or for acknowledging the sovereignty of nations or Indigenous peoples. We undertook a distributed datathon to quantify the availability of these missing metadata and to test the hypothesis that their availability decays with time. We also worked to remediate missing metadata by extracting them from associated published papers, online repositories, and direct communication with authors. Starting with 848 candidate genomic data sets (reduced representation and whole genome) from the International Nucleotide Sequence Database Collaboration, we determined that 561 contained mostly samples from wild populations. We successfully restored spatiotemporal metadata for 78% of these 561 data sets (n = 440 data sets with data on 45,105 individuals from 762 species in 17 phyla). Examining papers and online repositories was much more fruitful than contacting 351 authors, who replied to our email requests 45% of the time. Overall, 23% of our email queries to authors unearthed useful metadata. The probability of retrieving spatiotemporal metadata declined significantly as age of the data set increased. There was a 13.5% yearly decrease in metadata associated with published papers or online repositories and up to a 22% yearly decrease in metadata that were only available from authors. This rapid decay in metadata availability, mirrored in studies of other types of biological data, should motivate swift updates to data‐sharing policies and researcher practices to ensure that the valuable context provided by metadata is not lost to conservation science forever.
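
The figures quoted in the abstract can be checked and extrapolated with a few lines of arithmetic. The sketch below is illustrative only: it confirms that 440 of 561 data sets rounds to 78%, and then shows how a constant yearly decrease would compound over time. The abstract does not say whether the 13.5% and 22% decreases apply to the probability or to the odds of retrieval, so the assumption here (a fixed multiplicative loss in the retrievable fraction each year) is ours, as are the function names.

```python
import math


def restored_share(restored: int, total: int) -> float:
    """Percentage of data sets whose metadata were restored."""
    return 100.0 * restored / total


def remaining_fraction(yearly_decrease: float, years: float) -> float:
    """Fraction still retrievable after `years`, ASSUMING the yearly
    decrease compounds multiplicatively (our assumption, not stated
    in the abstract)."""
    return (1.0 - yearly_decrease) ** years


def half_life(yearly_decrease: float) -> float:
    """Years until half of the metadata would be lost under the same
    compounding assumption."""
    return math.log(0.5) / math.log(1.0 - yearly_decrease)


if __name__ == "__main__":
    # 440 restored out of 561 wild-population data sets (from the abstract)
    print(f"restored: {restored_share(440, 561):.1f}%")

    # 13.5%: papers/repositories; 22%: author-only metadata (upper bound)
    for label, rate in (("papers/repositories", 0.135), ("authors only", 0.22)):
        print(f"{label}: {remaining_fraction(rate, 5):.2f} remaining after 5 years, "
              f"half-life about {half_life(rate):.1f} years")
```

Under this reading, the author-only metadata would reach the halfway point sooner than metadata recorded in papers or repositories, which is consistent with the abstract's emphasis on timely curation.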

Funder

National Science Foundation

Publisher

Wiley

Subject

Nature and Landscape Conservation; Ecology; Ecology, Evolution, Behavior and Systematics

Cited by 8 articles.