Abstract
Deep learning-based algorithms have made tremendous progress in recent years, but they face a bottleneck because their development relies heavily on access to large datasets. To mitigate this limitation, cross-silo federated learning has emerged as a way for multiple institutions to train collaborative models without sharing the raw data used for training. However, although artificial intelligence experts have the expertise to develop state-of-the-art models and actively share their code through notebook environments, implementing a federated learning system in real-world applications entails significant engineering and deployment effort. To reduce the complexity of federation setups and bridge the gap between federated learning and notebook users, this paper introduces the Notebook Federator, a solution that uses the Jupyter environment as part of the federated learning pipeline and simplifies its automation. The feasibility of this approach is demonstrated with a collaborative model for a digital pathology image analysis task, in which the federated model reaches an accuracy of 0.8633 on the test set, compared with 0.7881, 0.6514, and 0.8096 for the centralized configurations of each institution, respectively. As a fast and reproducible tool, the proposed solution enables the deployment of a cross-country federated environment in only a few minutes.
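The abstract does not state which aggregation rule the Notebook Federator uses. As an illustration only of how institutions can contribute to a shared model without exchanging raw data, the following minimal sketch assumes the widely used federated averaging (FedAvg) rule: each silo trains locally and sends only its model parameters, which a coordinator averages, weighted by local dataset size. The function name `federated_average`, the parameter layout, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: a model is a mapping from parameter name to a flat
# list of floats. Real frameworks use tensors; lists keep the sketch
# dependency-free.
Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Average client parameters, weighted by each client's sample count.

    Assumes FedAvg-style aggregation; only parameters are passed in,
    never the clients' raw training data.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Illustrative round with three silos (all values are made up).
silo_updates = [
    {"layer.weight": [0.10, 0.20], "layer.bias": [0.00]},
    {"layer.weight": [0.30, 0.40], "layer.bias": [0.10]},
    {"layer.weight": [0.50, 0.60], "layer.bias": [0.20]},
]
silo_sizes = [100, 300, 600]
global_weights = federated_average(silo_updates, silo_sizes)
```

In a full round, the coordinator would send `global_weights` back to every silo for the next local training pass. Weighting by sample count is one common design choice, and an unweighted mean is an equally simple alternative when silo sizes are not disclosed.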
Funder
European Union’s Horizon 2020 research and innovation programme with the project CLARIFY under Marie Sklodowska-Curie
ENVRI-FAIR
BlueCloud
ARTICONF
LifeWatch ERIC
Spanish Ministry of Economy and Competitiveness
Valencian Graduate School and Research Network for Artificial Intelligence & Generalitat Valenciana and Universitat Politècnica de València
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science