Affiliation:
1. McGill University, 845 Sherbrooke St W, Montreal, Quebec H3A 0G4, Canada
2. INRIA
Abstract
Machine learning brings the hope of finding new biomarkers extracted from cohorts with rich biomedical measurements. A good biomarker is one that gives reliable detection of the corresponding condition. However, biomarkers are often extracted from a cohort that differs from the target population. Such a mismatch, known as a dataset shift, can undermine the application of the biomarker to new individuals. Dataset shifts are frequent in biomedical research, e.g., because of recruitment biases. When a dataset shift occurs, standard machine-learning techniques do not suffice to extract and validate biomarkers. This article provides an overview of when and how dataset shifts break machine-learning–extracted biomarkers, as well as detection and correction strategies.
Funder
National Institutes of Health
Publisher
Oxford University Press (OUP)
Subject
Computer Science Applications, Health Informatics
References: 80 articles.
Cited by
37 articles.