Affiliation:
1. State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
2. Guizhou Minzu University, Guiyang 550025, China
3. Guizhou Big Data Academy, Guizhou University, Guiyang 550025, China
4. Key Laboratory of Advanced Manufacturing Technology, Ministry of Education, Guizhou University, Guiyang 550025, China
Abstract
Shared gradients are widely used to protect the privacy of training data in distributed machine learning systems. However, Deep Leakage from Gradients (DLG) research has shown that private training data can be recovered from shared gradients. The original DLG method still suffers from several issues, such as exploding gradients, a low attack success rate, and low fidelity of the recovered data. In this study, a Wasserstein DLG method, named WDLG, is proposed. Theoretical analysis shows that, provided the output layer of the model has a bias term, the label of the data can be predicted from the sign of the bias gradient (i.e., by whether it is negative), independently of how well the shared gradient is approximated; the label can therefore be recovered with 100% accuracy. In the proposed method, the Wasserstein distance is used to measure the error between the shared gradient and the virtual gradient, which improves training stability, eliminates the exploding-gradient phenomenon, and improves the fidelity of the recovered data. Moreover, a large-learning-rate strategy is designed to further accelerate convergence. Finally, the WDLG method is validated on the MNIST, Fashion-MNIST, SVHN, CIFAR-100, and LFW datasets. Experimental results show that the proposed WDLG method provides more stable updates of the virtual data, a higher attack success rate, faster convergence, higher fidelity of the recovered images, and support for large-learning-rate strategies.
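To make the two ideas in the abstract concrete, the following is a minimal PyTorch sketch of a DLG-style gradient-matching attack: the label is read off from the sign of the output-layer bias gradient, and the virtual data is optimized by matching gradients under a Wasserstein-style loss. The exact distance formulation, the helper name wasserstein_1d, and the toy model are assumptions for illustration only; they are not the authors' implementation.

```python
# Minimal sketch of a DLG-style attack under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy victim model: a linear classifier whose output layer has a bias term.
model = nn.Linear(32, 10)

# "Private" sample and the shared gradient the attacker observes.
x_true = torch.randn(1, 32)
y_true = torch.tensor([3])
loss = F.cross_entropy(model(x_true), y_true)
shared_grads = torch.autograd.grad(loss, model.parameters())

# (1) Label recovery: with softmax cross-entropy, dL/db_i = p_i - 1{i == y},
#     so the only negative entry of the bias gradient marks the true label.
bias_grad = shared_grads[1]              # gradient w.r.t. the output-layer bias
y_hat = torch.argmin(bias_grad).view(1)  # most negative entry = recovered label

# (2) Gradient matching with a Wasserstein-style loss. A simple 1-D
#     approximation (L1 distance between sorted, flattened gradients) stands
#     in for the paper's Wasserstein distance -- an assumption of this sketch.
def wasserstein_1d(g1, g2):
    return (torch.sort(g1.flatten())[0] - torch.sort(g2.flatten())[0]).abs().mean()

x_dummy = torch.randn(1, 32, requires_grad=True)      # virtual data
opt = torch.optim.Adam([x_dummy], lr=0.1)              # relatively large learning rate

for step in range(300):
    opt.zero_grad()
    dummy_loss = F.cross_entropy(model(x_dummy), y_hat)
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    match = sum(wasserstein_1d(dg, sg) for dg, sg in zip(dummy_grads, shared_grads))
    match.backward()
    opt.step()

print("recovered label:", y_hat.item())
print("reconstruction error:", (x_dummy.detach() - x_true).norm().item())
```

In this sketch the label recovery step needs no optimization at all, which is why it is independent of how well the shared gradient is approximated; only the virtual data is refined by the gradient-matching loop.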
Funder
National Basic Research Program of China
Subject
Artificial Intelligence, Human-Computer Interaction, Theoretical Computer Science, Software
Cited by
6 articles.