Towards a Deep Learning-Powered Herbarium Image Analysis Platform

Author:

Sklab Youcef, Ariouat Hanane, Boujydah Youssef, Qacami Yassine, Prifti Edi, Zucker Jean-Daniel, Vignes Lebbe Régine, Chenin Eric

Abstract

Global digitization efforts have archived millions of specimen scans from herbarium collections worldwide; these collections are essential for studying plant evolution and biodiversity. ReColNat currently hosts over 10 million images. However, analyzing these datasets poses major challenges for botanical research. Deep learning applied to biodiversity analyses, and to herbarium scans in particular, has shown promising results across numerous tasks (Ariouat et al. 2023, Ariouat et al. 2024, Groom et al. 2023, Sahraoui et al. 2023). Within the e-Col+ project (ANR-21-ESRE-0053), we are developing multiple deep learning models aimed at identifying plant morphological traits. We have developed pipelines and models for cleaning, analyzing, and transforming herbarium images, including models for: i) detecting non-vegetal elements such as barcodes, envelopes, and labels; ii) detecting plant organs such as leaves, flowers, and fruits; and iii) segmenting plant parts for image cleaning. We are also developing models for classification tasks related to various morphological traits.

To validate these models, improve their generalization, and make them easy for end users to use, it is crucial to deploy them within a generic platform. This platform, PlantAI, is currently under development within the e-Col+ project. It should enable easy deployment of models for testing during development and allow users to load annotations for new traits in order to train a model and add it to the existing catalog. The platform is based on a microservice architecture and lets users upload images, create custom datasets, and access various AI models for image analysis. It is composed of four main modules, as illustrated in Fig. 1. The first module is the collaborative workspace manager, which allows users to create projects and image datasets and to invite other users to collaborate on a project. The second module comprises the navigation interface and dashboards: it integrates a search engine using metadata and AI annotations, a navigation interface between projects, datasets, and specimens, and dashboards for analysis across datasets, specimens, and AI models. The third module is the dataset manager, which handles the metadata and annotations associated with the specimens. These annotations can be produced either by expert users or by AI models. The fourth module manages the AI models, so that they can be used to generate AI annotations of specimens.

During the development lifecycle of an AI model, users can create datasets and annotate them with AI models. These annotations can be in one of two states: validated by experts or non-validated. Users collaborating on a project can flag errors in the model predictions and leave comments explaining their evaluations. These expert corrections can then be used to retrain the models and thus improve their performance. The platform should be highly beneficial for botanists, making biodiversity analyses from herbarium scans more efficient and effective. We aim to provide users with a catalog of AI models through this platform and to allow them to import their own datasets, with their own annotations for traits of their choice. Users will be able to select a model from the catalog and train it on their dataset. The resulting model will then be deployed automatically and made available for AI annotation. The annotations it produces will in turn be available automatically in the filtering and navigation interface, so that AI annotations are integrated into navigation dynamically.
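
The annotation lifecycle described above (annotations produced by experts or by AI models, a validated or non-validated state, expert comments, and corrections fed back into retraining) can be illustrated with a minimal Python sketch. It is not the PlantAI implementation: every class, field, and function name below is hypothetical, and the rule that an expert correction also marks the annotation as validated is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Source(Enum):
    """Who produced an annotation (hypothetical names)."""
    EXPERT = "expert"
    AI_MODEL = "ai_model"


class State(Enum):
    """The two annotation states mentioned in the abstract."""
    NON_VALIDATED = "non_validated"
    VALIDATED = "validated"


@dataclass
class Annotation:
    specimen_id: str
    trait: str
    value: str
    source: Source
    state: State = State.NON_VALIDATED
    corrected_value: Optional[str] = None
    comments: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """An expert confirms the annotation as it stands."""
        self.state = State.VALIDATED

    def flag_error(self, corrected_value: str, comment: str) -> None:
        """An expert reports an error, gives the right value, and explains why.

        Assumption: an expert-corrected annotation counts as validated.
        """
        self.corrected_value = corrected_value
        self.comments.append(comment)
        self.state = State.VALIDATED


def retraining_examples(annotations: List[Annotation]) -> List[Tuple[str, str, str]]:
    """Collect (specimen_id, trait, label) triples from validated annotations.

    The expert's corrected value, when present, replaces the original value.
    """
    examples = []
    for a in annotations:
        if a.state is State.VALIDATED:
            label = a.corrected_value if a.corrected_value is not None else a.value
            examples.append((a.specimen_id, a.trait, label))
    return examples
```

In this sketch, only annotations reviewed by an expert reach the retraining set, which keeps unreviewed model predictions from being fed back as ground truth.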

Publisher

Pensoft Publishers
