tidysdm: Leveraging the flexibility of tidymodels for species distribution modelling in R

Author:

Leonardi Michela (1), Colucci Margherita (1, 2), Pozzi Andrea Vittorio (1), Scerri Eleanor M. L. (2, 3, 4), Manica Andrea (1)

Affiliation:

1. Evolutionary Ecology Group, Department of Zoology, University of Cambridge, Cambridge, UK

2. Human Palaeosystems Research Group, Max Planck Institute of Geoanthropology, Jena, Germany

3. Department of Classics and Archaeology, University of Malta, Msida, Malta

4. Institute of Prehistoric Archaeology, University of Cologne, Cologne, Germany

Abstract

In species distribution modelling (SDM), it is common practice to explore multiple machine learning (ML) algorithms and combine their results into ensembles. In R, many implementations of different ML algorithms are available but, as they were mostly developed independently, they often use inconsistent syntax and data structures. For this reason, repeating an analysis with multiple algorithms and combining their results can be challenging. Specialised SDM packages solve this problem by wrapping the original functions in a simpler, unified interface that handles each specific requirement. However, creating and maintaining such interfaces is time-consuming, and with this approach the user cannot easily integrate other methods that may become available. Here, we present tidysdm, an R package that solves this problem by taking advantage of the tidymodels universe. tidymodels provides a standardised grammar, data structures and modelling interfaces, and a well-documented infrastructure to integrate new algorithms and metrics. The wide adoption of tidymodels means that most ML algorithms and metrics are already integrated, and the user can add further ones. For the same reason, new statistical approaches tend to be implemented quickly, so they can be easily integrated into existing pipelines and analyses. tidysdm builds on this infrastructure to provide a flexible and fully customisable pipeline to fit SDMs. It includes SDM-specific algorithms and metrics, and methods to facilitate the use of spatial data within tidymodels. Additionally, tidysdm is the first software that natively allows SDM to be performed using data from different periods, expanding the availability of SDM for scholars working in palaeontology, archaeology, palaeobiology, palaeoecology and other disciplines focussing on the past.

Funder

Natural Environment Research Council

Leverhulme Trust

Publisher

Wiley
