Strengths and limitations of non-disclosive data analysis: a comparison of breast cancer survival classifiers using VisualSHIELD

Author:

Tomasoni Danilo,Lombardo Rosario,Lauria Mario

Abstract

Preserving data privacy is an important concern in the research use of patient data. The DataSHIELD suite enables privacy-aware advanced statistical analysis in a federated setting. Despite its many applications, it has a few open practical issues: the complexity of hosting a federated infrastructure, the performance penalty imposed by the privacy-preserving constraints, and the ease of use by non-technical users. In this work, we describe a case study in which we review different breast cancer classifiers and report our findings about the limits and advantages of such non-disclosive suite of tools in a realistic setting. Five independent gene expression datasets of breast cancer survival were downloaded from Gene Expression Omnibus (GEO) and pooled together through the federated infrastructure. Three previously published and two newly proposed 5-year cancer-free survival risk score classifiers were trained in a federated environment, and an additional reference classifier was trained with unconstrained data access. The performance of these six classifiers was systematically evaluated, and the results show that i) the published classifiers do not generalize well when applied to patient cohorts that differ from those used to develop them; ii) among the methods we tried, the classification using logistic regression worked better on average, closely followed by random forest; iii) the unconstrained version of the logistic regression classifier outperformed the federated version by 4% on average. Reproducibility of our experiments is ensured through the use of VisualSHIELD, an open-source tool that augments DataSHIELD with new functions, a standardized deployment procedure, and a simple graphical user interface.

Publisher

Frontiers Media SA

Subject

Genetics (clinical),Genetics,Molecular Medicine

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3