A FAIR and modular image‐based workflow for knowledge discovery in the emerging field of imageomics

Author:

Balk Meghan A. [1], Bradley John [2], Maruf M. [3], Altintaş Bahadir [4, 5], Bakiş Yasin [5], Bart Henry L. [5], Breen David [6], Florian Christopher R. [1], Greenberg Jane [7], Karpatne Anuj [3], Karnani Kevin [6], Mabee Paula [1], Pepper Joel [6], Jebbia Dom [8], Tabarin Thibault [1], Wang Xiaojun [9], Lapp Hilmar [2]

Affiliation:

1. National Ecological Observatory Network, Battelle Memorial Institute, Boulder, Colorado, USA

2. Department of Biostatistics & Bioinformatics, Duke University School of Medicine, Durham, North Carolina, USA

3. Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA

4. Department of Mathematics and Science Education, Bolu Abant Izzet Baysal University, Bolu, Turkey

5. Department of Ecology and Evolutionary Biology, School of Science and Engineering, Tulane University, New Orleans, Louisiana, USA

6. Department of Computer Science, Drexel University, Philadelphia, Pennsylvania, USA

7. Metadata Research Center, College of Computing & Informatics, Drexel University, Philadelphia, Pennsylvania, USA

8. Carnegie Mellon University Libraries, Pittsburgh, Pennsylvania, USA

9. Biodiversity Research Institute, Tulane University, New Orleans, Louisiana, USA

Abstract

Image‐based machine learning tools are an ascendant ‘big data’ research avenue. Citizen science platforms, like iNaturalist, and museum‐led initiatives provide researchers with an abundance of data from which to extract knowledge, including metadata, species identifications, and phenomic data. Ecological and evolutionary biologists are increasingly applying complex, multi‐step processes to these data. These processes often include machine learning techniques, frequently built by others, that are difficult for other members of a collaboration to reuse. We present a conceptual workflow model for machine learning applications that use image data to extract biological knowledge in the emerging field of imageomics. From this conceptual workflow we derive an implementation for a specific imageomics application. The implementation adheres to FAIR principles: it is a formal workflow definition that allows fully automated and reproducible execution, and it consists of reusable workflow components. We outline technologies and best practices for creating an automated, reusable and modular workflow, and we show how they promote the reuse of machine learning models and their adaptation to new research questions. This conceptual workflow can be adapted: it can be semi‐automated, contain different components than those presented here, or have parallel components for comparative studies. We encourage researchers (both computer scientists and biologists) to build upon this conceptual workflow, which combines machine learning tools on image data, to answer novel scientific questions in their respective fields.

Funder

National Science Foundation

Publisher

Wiley

References: 83 articles.
