Authors:
Rachid Guerraoui, Nirupam Gupta
Abstract
Large Language Models (LLMs) have gained significant attention in recent years due to their potential to transform industries and sectors. Scaling LLMs further, however, requires access to vast linguistic resources that are being rapidly depleted. Moreover, the available text sources, such as emails, social media interactions, or internal documents, may contain private information, making them susceptible to misuse. On-premises Federated Learning (FL) with privacy-preserving model updates offers an alternative avenue for LLM development: it ensures data sovereignty and enables peers to collaborate while guaranteeing that the sensitive parts of their private data cannot be reconstructed. In large-scale FL, however, malicious users may attempt to poison LLMs for their own benefit. The problem of protecting the learning procedure against such users is known as Byzantine robustness, and it is crucial for training models that perform accurately despite faulty machines and poisoned data. Designing FL methods that are simultaneously privacy-preserving and Byzantine-robust is challenging, but ongoing research suggests ways to combine the differentially private Gaussian mechanism for privacy preservation with spectral robust averaging for robustness. Whether this approach applies to LLMs, or whether a major player in the domain will instead capture all private information sources through network effects, remains to be seen.
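The combination sketched in the abstract, a differentially private Gaussian mechanism on each client's update followed by a robust aggregation rule on the server, can be illustrated in a few lines. The sketch below is a simplified assumption: clipping and Gaussian noise stand in for a full DP accounting scheme, and a coordinate-wise trimmed mean stands in for the spectral robust-averaging method mentioned in the abstract; the function names, parameters, and toy data are all hypothetical.

```python
import numpy as np

def dp_gaussian_update(grad, clip_norm=5.0, sigma=0.1, rng=None):
    """Clip a client's update to `clip_norm` and add Gaussian noise,
    as in the differentially private Gaussian mechanism."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, sigma * clip_norm, size=grad.shape)

def trimmed_mean(updates, trim=1):
    """Byzantine-robust aggregation: per coordinate, drop the `trim`
    largest and smallest values before averaging (a simple stand-in
    for spectral robust averaging)."""
    stacked = np.sort(np.stack(updates), axis=0)
    return stacked[trim:len(updates) - trim].mean(axis=0)

rng = np.random.default_rng(0)
honest = [np.ones(4) + 0.1 * rng.normal(size=4) for _ in range(8)]
byzantine = [np.full(4, 100.0)]  # a poisoned update from a faulty client
noisy = [dp_gaussian_update(g, rng=rng) for g in honest + byzantine]
agg = trimmed_mean(noisy, trim=1)
print(np.round(agg, 2))  # stays near the honest mean (~1.0) despite the outlier
```

The trimming step discards the poisoned coordinate values before averaging, so the aggregate tracks the honest clients even though one update is wildly off; the added noise is what makes each individual contribution hard to reconstruct.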
Publisher
Springer Nature Switzerland