FloraTraiter: Automated parsing of traits from descriptive biodiversity literature

Published:2024-01 Issue:1 Volume:12
ISSN:2168-0450
Container-title:Applications in Plant Sciences
language:en
Short-container-title:Appl Plant Sci

Author:

Folk Ryan A.¹, Guralnick Robert P.²,³, LaFrance Raphael T.²

Affiliation:

1. Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi, USA

2. Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA

3. Biodiversity Institute, University of Florida, Gainesville, Florida, USA

Abstract

Premise: Plant trait data are essential for quantifying biodiversity and function across Earth, but these data are challenging to acquire for large studies. Diverse strategies are needed, including the liberation of heritage data locked within specialist literature such as floras and taxonomic monographs. Here we report FloraTraiter, a novel approach using rule‐based natural language processing (NLP) to parse computable trait data from biodiversity literature.

Methods: FloraTraiter was implemented through collaborative work between programmers and botanical experts and customized for both online floras and scanned literature. We report a strategy spanning optical character recognition, recognition of taxa, iterative building of traits, and establishing linkages among all of these, as well as curational tools and code for turning these results into standard morphological matrices.

Results: Over 95% of treatment content was successfully parsed for traits with <1% error. Data for more than 700 taxa are reported, including a demonstration of common downstream uses.

Conclusions: We identify strategies, applications, tips, and challenges that we hope will facilitate future similar efforts to produce large open‐source trait data sets for broad community reuse. Largely automated tools like FloraTraiter will be an important addition to the toolkit for assembling trait data at scale.
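To make the idea of rule‐based trait parsing concrete, the following is a minimal sketch in Python. It is not FloraTraiter's code or API: the names `PARTS`, `RANGE`, `parse_traits`, and `to_matrix` are hypothetical, the vocabulary is a toy one, and the description strings are invented for illustration. It assumes that a measurement belongs to the plant part named in the same clause, and it shows only two of the stages named in the abstract: linking traits to parts and pivoting the results into a taxon-by-trait matrix.

```python
import re

# Toy vocabulary of plant parts (an assumption; a real lexicon would be far larger).
PARTS = ("leaves", "petals", "sepals", "stems")

# Map dimension words to trait names; measurements without one become "size".
DIMS = {"long": "length", "wide": "width"}

# Rule: "<low>-<high> <unit>", optionally followed by "long" or "wide".
RANGE = re.compile(
    r"(?P<low>\d+(?:\.\d+)?)\s*[-–]\s*(?P<high>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>mm|cm|m)\b(?:\s+(?P<dim>long|wide))?"
)


def parse_traits(text):
    """Return one record per measurement, linked to the part named in its clause."""
    records = []
    # Split on ';' or '.' followed by whitespace, so decimals like 2.5 stay intact.
    for clause in re.split(r"[.;]\s+", text):
        part = next(
            (p for p in PARTS if re.search(rf"\b{re.escape(p)}\b", clause, re.I)),
            None,
        )
        if part is None:
            continue  # no known part in this clause, so nothing to link to
        for m in RANGE.finditer(clause):
            records.append({
                "part": part,
                "trait": DIMS.get(m.group("dim"), "size"),
                "low": float(m.group("low")),
                "high": float(m.group("high")),
                "unit": m.group("unit"),
            })
    return records


def to_matrix(parsed_by_taxon):
    """Pivot {taxon: [records]} into (column names, {taxon: row of (low, high) or None})."""
    def key(r):
        return f"{r['part']}_{r['trait']}_{r['unit']}"

    columns = sorted({key(r) for recs in parsed_by_taxon.values() for r in recs})
    rows = {}
    for taxon, recs in parsed_by_taxon.items():
        cells = {key(r): (r["low"], r["high"]) for r in recs}
        rows[taxon] = [cells.get(c) for c in columns]  # None marks missing data
    return columns, rows


# Invented descriptions for two placeholder taxa.
parsed = {
    "Taxon A": parse_traits("Leaves 3-7 cm long, 1-2.5 cm wide; petals white, 4-6 mm long."),
    "Taxon B": parse_traits("Petals 2-3 mm long."),
}
columns, rows = to_matrix(parsed)
```

Missing cells are kept as `None` rather than dropped, so every taxon row has the same set of columns, as a morphological matrix requires.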

Publisher

Wiley
