Affiliations
1. Universidad Carlos III de Madrid, Madrid, Spain
2. Tree Technology, Llanera, Spain
Abstract
In recent years, Federated Learning (FL) approaches have grown remarkably because they have proven very effective for training large Machine Learning (ML) models while preserving data confidentiality, as required by the GDPR or other business confidentiality restrictions that may apply. Despite the success of FL, performance degrades considerably when data is not identically distributed (non-ID) across participants, as local model updates tend to diverge from the optimal global solution and the model averaging procedure at the aggregator becomes less effective. Kernel methods such as Support Vector Machines (SVMs) have not seen an equivalent evolution in privacy-preserving edge computing because they suffer from inherent computational, privacy, and scalability issues. Furthermore, non-linear SVMs do not naturally lead to federated schemes: locally trained models cannot be passed to the aggregator because they reveal training data (they are built on support vectors), and the global model cannot be updated at every worker using gradient descent. In this article, we explore the use of a controlled-complexity ("Budget") Distributed SVM (BDSVM) in the FL scenario with non-ID data, the least favorable situation but a very common one in practice. The proposed BDSVM algorithm works as follows: model weights are broadcast to the workers, which locally update kernel Gram matrices computed on a common architectural base and send them back to the aggregator; the aggregator combines them, updates the global model, and repeats the procedure until a convergence criterion is met. Experimental results on synthetic 2D datasets show that the proposed method obtains maximal-margin decision boundaries even when the data is non-ID distributed. Further experiments on real-world datasets with non-ID data distributions show that the proposed algorithm achieves better performance with lower communication requirements than a comparable Multilayer Perceptron (MLP) trained with FedAvg, and the advantage becomes more pronounced as the number of edge devices grows. We also demonstrate the robustness of the proposed method against information leakage and membership inference attacks, and in situations with dropout or straggler participants. Finally, in experiments run on separate processes/machines interconnected via the cloud messaging service developed in the EU-H2020 MUSKETEER project, BDSVM trains better models than FedAvg in about half the time.
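The broadcast-update-aggregate loop summarized in the abstract can be sketched in a few lines of Python. The sketch below is an illustrative assumption, not the paper's actual BDSVM implementation: it supposes an IRWLS-style least-squares SVM variant in which each worker weights its samples by their current margin violations and sends back only aggregated Gram statistics over a shared set of centroids (the "common architectural base"), so neither raw data nor support vectors ever leave the worker. All names (rbf_kernel, worker_statistics, aggregator_round), the centroid choice, and the hyperparameters are hypothetical.

import numpy as np

def rbf_kernel(X, C, gamma):
    # RBF kernel between data X (n x d) and the shared centroids C (m x d).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def worker_statistics(X, y, C, w, gamma, cost=10.0, eps=1e-6):
    # Hypothetical local step (IRWLS-style): weight each sample by its
    # current margin violation, then return only m x m / m-sized Gram
    # statistics -- never raw data or support vectors.
    K = rbf_kernel(X, C, gamma)                      # n x m
    margins = y * (K @ w)                            # functional margins
    a = np.where(margins < 1.0, cost / np.maximum(1.0 - margins, eps), 0.0)
    Ka = K * a[:, None]
    return K.T @ Ka, Ka.T @ y

def aggregator_round(stats, lam=1e-3):
    # Combine workers' statistics and solve a regularized linear system
    # for the new global weights over the shared centroids.
    A = sum(s[0] for s in stats)
    b = sum(s[1] for s in stats)
    return np.linalg.solve(A + lam * np.eye(A.shape[0]), b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic non-ID shards: each worker holds a single class.
    shards = [(rng.standard_normal((100, 2)) + s, np.full(100, s))
              for s in (+1.0, -1.0)]
    C = rng.standard_normal((20, 2))   # shared budget of 20 centroids
    w = np.zeros(20)
    for _ in range(10):                # broadcast w, collect stats, update
        stats = [worker_statistics(Xk, yk, C, w, gamma=0.5)
                 for Xk, yk in shards]
        w = aggregator_round(stats)
    X_all = np.vstack([X for X, _ in shards])
    y_all = np.concatenate([y for _, y in shards])
    acc = (np.sign(rbf_kernel(X_all, C, 0.5) @ w) == y_all).mean()
    print(f"training accuracy after 10 rounds: {acc:.3f}")

Under these assumptions, the privacy-relevant point is that each worker transmits only an m x m matrix and an m-vector whose size depends on the budget m, not on the local dataset, which is also consistent with the reduced communication the abstract reports relative to FedAvg.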
Funder
European Union’s Horizon 2020 research and innovation programme
FEDER/Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence, Theoretical Computer Science
Cited by
5 articles.