Robustness and reproducibility for AI learning in biomedical sciences: RENOIR

Published:2024-01-22 Issue:1 Volume:14
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Barberis Alessandro,Aerts Hugo J. W. L.,Buffa Francesca M.

Abstract

Artificial intelligence (AI) techniques are increasingly applied across various domains, favoured by the growing acquisition and public availability of large, complex datasets. Despite this trend, AI publications often suffer from a lack of reproducibility and poor generalisation of findings, undermining scientific value and contributing to global research waste. To address these issues, and focusing on the learning aspect of the AI field, we present RENOIR (REpeated random sampliNg fOr machIne leaRning), a modular open-source platform for robust and reproducible machine learning (ML) analysis. RENOIR adopts standardised pipelines for model training and testing, introducing elements of novelty such as an assessment of how algorithm performance depends on sample size. Additionally, RENOIR offers automated generation of transparent and usable reports, aiming to enhance the quality and reproducibility of AI studies. To demonstrate the versatility of our tool, we applied it to benchmark datasets from health, computer science, and STEM (Science, Technology, Engineering, and Mathematics) domains. Furthermore, we showcase RENOIR’s successful application in recently published studies, where it identified classifiers for SETD2 and TP53 mutation status in cancer. Finally, we present a use case where RENOIR was employed to address a significant pharmacological challenge: predicting drug efficacy. RENOIR is freely available at https://github.com/alebarberis/renoir.

Funder

Cancer Research UK

Prostate Cancer UK

European Research Council

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41598-024-51381-4.pdf

Reference: 32 articles.

1. Stephens, Z. D. et al. Big data: Astronomical or genomical? PLoS Biol. 13, e1002195 (2015).

2. Marx, V. The big challenges of big data. Nature 498, 255–260 (2013).

3. Hornby, A. S., Deuter, M., Turnbull, J. & Bradbury, J. Oxford Advanced Learner’s Dictionary of Current English (Oxford University Press, 2015).

4. Begley, C. G. & Ellis, L. M. Raise standards for preclinical cancer research. Nature 483, 531–533 (2012).

5. Stupple, A., Singerman, D. & Celi, L. A. The reproducibility crisis in the age of digital medicine. npj Digit. Med. 2, 1–3 (2019).

Cited by 3 articles.

1. Identification and validation of a machine learning model of complete response to radiation in rectal cancer reveals immune infiltrate and TGFβ as key predictors;eBioMedicine;2024-08

2. Artificial intelligence for high content imaging in drug discovery;Current Opinion in Structural Biology;2024-08

3. The use of artificial intelligence in induced pluripotent stem cell-based technology over 10-year period: A systematic scoping review;PLOS ONE;2024-05-21