Towards provably efficient quantum algorithms for large-scale machine-learning models

Published:2024-01-10 Issue:1 Volume:15 Page:
ISSN:2041-1723
Container-title:Nature Communications
language:en
Short-container-title:Nat Commun

Author:

Liu Junyu, Liu Minzhao, Liu Jin-Peng, Ye Ziyu, Wang Yunfei, Alexeev Yuri, Eisert Jens, Jiang Liang

Abstract

Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include huge computational expenses, power, and time used both in the pre-training and fine-tuning process. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as

$$\mathcal{O}(T^{2}\times \mathrm{polylog}(n)),$$

where n is the size of the models and T is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Based on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows solidly that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.

Funder

Deutsche Forschungsgemeinschaft

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41467-023-43957-x.pdf

References: 40 articles.

1. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

2. Johnson, K. OpenAI debuts DALL-E for generating images from text. VentureBeat (2021).

3. Brown, T. et al. Language models are few-shot learners. Adv. Neur. Inf. Process Sys. 33, 1877–1901 (2020).

4. Roose, K. The brilliance and weirdness of ChatGPT. The New York Times (2022).

5. Lewkowycz, A. et al. Solving quantitative reasoning problems with language models. NeurIPS. https://openreview.net/forum?id=IFXTZERXdM7 (2022).

Cited by 10 articles.

1. Multi-view hypergraph regularized Lp norm least squares twin support vector machines for semi-supervised learning;Pattern Recognition;2024-12

2. Dense outputs from quantum simulations;Journal of Computational Physics;2024-10

3. Deep Quantum-Transformer Networks for Multimodal Beam Prediction in ISAC Systems;IEEE Internet of Things Journal;2024-09-15

4. Deep-learning-based quantum algorithms for solving nonlinear partial differential equations;Physical Review A;2024-08-14

5. Multi-participant quantum anonymous communication based on high-dimensional entangled states;Physica Scripta;2024-08-12