Budget Distributed Support Vector Machine for Non-ID Federated Learning Scenarios

Author:

Navia-Vázquez, A.¹; Díaz-Morales, R.²; Fernández-Díaz, M.²

Affiliation:

1. Universidad Carlos III de Madrid, Madrid, Spain

2. Tree Technology, Llanera, Spain

Abstract

In recent years, there has been remarkable growth in Federated Learning (FL) approaches because they have proven very effective for training large Machine Learning (ML) models while preserving data confidentiality, as required by the GDPR or other applicable business confidentiality restrictions. Despite the success of FL, performance is greatly reduced when data is not identically distributed (non-ID) across participants, as local model updates tend to diverge from the optimal global solution and the model averaging procedure in the aggregator becomes less effective. Kernel methods such as Support Vector Machines (SVMs) have not seen an equivalent evolution in the area of privacy-preserving edge computing because they suffer from inherent computational, privacy, and scalability issues. Furthermore, non-linear SVMs do not naturally lead to federated schemes: locally trained models cannot be passed to the aggregator because they reveal training data (they are built on Support Vectors), and the global model cannot be updated at every worker using gradient descent. In this article, we explore the use of a particular controlled-complexity ("Budget") Distributed SVM (BDSVM) in the FL scenario with non-ID data, which is the least favorable situation but very common in practice. The proposed BDSVM algorithm is as follows: model weights are broadcast to the workers, which locally update kernel Gram matrices computed on a common architectural base and send them back to the aggregator, which combines them, updates the global model, and repeats the procedure until a convergence criterion is met. Experimental results using synthetic 2D datasets show that the proposed method can obtain maximal-margin decision boundaries even when the data is non-ID distributed.
Further experiments using real-world datasets with non-ID data distribution show that the proposed algorithm provides better performance with lower communication requirements than a comparable Multilayer Perceptron (MLP) trained using FedAvg, and the advantage grows with the number of edge devices. We have also demonstrated the robustness of the proposed method against information leakage, membership inference attacks, and situations with dropout or straggler participants. Finally, in experiments run on separate processes/machines interconnected via the cloud messaging service developed in the context of the EU-H2020 MUSKETEER project, BDSVM is able to train better models than FedAvg in about half the time.
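The exchange described in the abstract — workers send aggregated kernel Gram matrices built on a shared set of base nodes instead of raw data or Support Vectors — can be sketched in a minimal single-round form. This is an illustrative reconstruction, not the authors' implementation: it uses a ridge-regression surrogate in place of the SVM hinge loss, and the centroid set `C`, `gamma`, and `lam` are assumed hypothetical parameters.

```python
import numpy as np

def rbf_kernel(X, C, gamma=1.0):
    # Pairwise RBF kernel between local data X (n, d) and the shared
    # architectural base of centroids C (m, d).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def worker_statistics(X, y, C, gamma=1.0):
    # Each worker transmits only aggregated Gram statistics (m x m and m),
    # never raw samples or Support Vectors.
    K = rbf_kernel(X, C, gamma)
    return K.T @ K, K.T @ y

def aggregate_and_solve(stats, lam=1e-3):
    # The aggregator sums the per-worker statistics and updates the
    # global weights with a regularized linear solve.
    m = stats[0][0].shape[0]
    KtK = sum(s[0] for s in stats)
    Kty = sum(s[1] for s in stats)
    return np.linalg.solve(KtK + lam * np.eye(m), Kty)

# Toy non-ID split: each worker holds samples from only one class.
rng = np.random.default_rng(0)
C = rng.normal(size=(10, 2))  # shared base, fixed budget of 10 nodes
w_neg = worker_statistics(rng.normal(-1.0, 0.5, (50, 2)), -np.ones(50), C)
w_pos = worker_statistics(rng.normal(+1.0, 0.5, (50, 2)), +np.ones(50), C)
a = aggregate_and_solve([w_neg, w_pos])
pred = np.sign(rbf_kernel(np.array([[1.0, 1.0], [-1.0, -1.0]]), C) @ a)
```

Even though neither worker ever sees the other class, the aggregated model separates both clusters, which is the behavior the abstract reports for the non-ID setting. The full BDSVM repeats this broadcast/update cycle until convergence rather than solving in one shot.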

Funder

European Union’s Horizon 2020 research and innovation programme

FEDER/Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Theoretical Computer Science

References: 65 articles.

Cited by 5 articles.