Hands-on training about overfitting

Published:2021-03-04 Issue:3 Volume:17 Page:e1008671
ISSN:1553-7358
Container-title:PLOS Computational Biology
language:en
Short-container-title:PLoS Comput Biol

Author:

Demšar Janez, Zupan Blaž

Abstract

Overfitting is one of the critical problems in developing models by machine learning. With machine learning becoming an essential technology in computational biology, we must include training about overfitting in all courses that introduce this technology to students and practitioners. We here propose hands-on training for overfitting that is suitable for introductory-level courses and can be carried out on its own or embedded within any data science course. We use a workflow-based design of machine learning pipelines, experimentation-based teaching, and a hands-on approach that focuses on concepts rather than the underlying mathematics. We here detail the data analysis workflows we use in training and motivate them from the viewpoint of teaching goals. Our proposed approach relies on Orange, an open-source data science toolbox that combines data visualization and machine learning, and that is tailored for education in machine learning and explorative data analysis.

Publisher

Public Library of Science (PLoS)

Subject

Computational Theory and Mathematics; Cellular and Molecular Neuroscience; Genetics; Molecular Biology; Ecology; Modeling and Simulation; Ecology, Evolution, Behavior and Systematics

Reference: 14 articles.

1. Machine learning in bioinformatics;P Larrañaga;Brief Bioinform,2006

2. Machine learning for integrating data in biology and medicine: Principles, practice, and opportunities;M Zitnik;Information Fusion,2019

3. A few useful things to know about machine learning;P Domingos;Commun ACM,2012

4. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification;R Simon;J Natl Cancer Inst,2003

5. Ten quick tips for machine learning in computational biology;D Chicco;BioData Mining,2017

Cited by 43 articles.

1. Peripheral blood DNA methylation signatures predict response to vedolizumab and ustekinumab in adult patients with Crohn’s disease: The EPIC-CD study;2024-07-25

2. A generalized model of cardiac surface motion for evaluating left anterior descending coronary artery dose in left breast cancer radiotherapy;Medical Physics;2024-06-22

3. Investigation of the myopic outcomes of the newer intraocular lens power calculation formulas in Korean patients with long eyes;Scientific Reports;2024-05-31

4. Slope deformation prediction based on noise reduction and deep learning: a point prediction and probability analysis method;Frontiers in Earth Science;2024-05-27

5. Artificial-Intelligence-Enhanced Analysis of In Vivo Confocal Microscopy in Corneal Diseases: A Review;Diagnostics;2024-03-26