A simple linear algebra identity to optimize large-scale neural network quantum states

Published:2024-08-02 Issue:1 Volume:7
ISSN:2399-3650
Container-title:Communications Physics
language:en
Short-container-title:Commun Phys

Author:

Rende Riccardo, Viteritti Luciano Loris, Bardone Lorenzo, Becca Federico, Goldt Sebastian

Abstract

Neural-network architectures have been increasingly used to represent quantum many-body wave functions. These networks require a large number of variational parameters and are challenging to optimize using traditional methods such as gradient descent. Stochastic reconfiguration (SR) has been effective with a limited number of parameters, but becomes impractical beyond a few thousand parameters. Here, we leverage a simple linear algebra identity to show that SR can be employed even in the deep learning scenario. We demonstrate the effectiveness of our method by optimizing a Deep Transformer architecture with 3 × 10^5 parameters, achieving state-of-the-art ground-state energy in the J1–J2 Heisenberg model at J2/J1 = 0.5 on the 10 × 10 square lattice, a challenging benchmark in highly-frustrated magnetism. This work marks a significant step forward in the scalability and efficiency of SR for neural-network quantum states, making them a promising method to investigate unknown quantum phases of matter, where other methods struggle.
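The abstract does not write out the identity it relies on. As a minimal sketch, assuming it is the standard push-through identity (X^T X + λ I_P)^{-1} X^T = X^T (X X^T + λ I_M)^{-1}, the NumPy snippet below checks numerically that a regularized update computed from a P × P system (P parameters) matches the one computed from an M × M system (M samples). The names `O`, `f`, `lam` and both function names are hypothetical, and the setup is a real-valued simplification that is not taken from the paper.

```python
import numpy as np


def update_param_space(O, f, lam):
    """Solve (O^T O + lam * I_P) d = O^T f, a P x P linear system."""
    P = O.shape[1]
    return np.linalg.solve(O.T @ O + lam * np.eye(P), O.T @ f)


def update_sample_space(O, f, lam):
    """Compute d = O^T (O O^T + lam * I_M)^{-1} f, an M x M linear system."""
    M = O.shape[0]
    return O.T @ np.linalg.solve(O @ O.T + lam * np.eye(M), f)


# Toy sizes (assumption): far more parameters than samples.
rng = np.random.default_rng(0)
M, P = 8, 200
O = rng.standard_normal((M, P))  # stand-in for per-sample log-derivatives
f = rng.standard_normal(M)       # stand-in for per-sample energy terms
lam = 1e-2                       # diagonal regularization

d_param = update_param_space(O, f, lam)
d_sample = update_sample_space(O, f, lam)
max_diff = float(np.max(np.abs(d_param - d_sample)))
```

When P is much larger than M, the second form only factorizes an M × M matrix, which is the kind of saving that would make an SR-style update affordable with hundreds of thousands of parameters.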

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s42005-024-01732-4.pdf

References: 65 articles.

1. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems Vol. 25 (eds Pereira, F. et al.) (Curran Associates, Inc., 2012). https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf.

2. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).

3. Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 30 (Curran Associates, Inc., 2017).

4. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Association for Computational Linguistics, 2019).

5. Brown, T., Mann, B., Ryder, N. et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems, Vol. 33, 1877–1901 (Curran Associates, Inc., 2020). https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.

Cited by 2 articles.

1. Quantum skyrmion dynamics studied by neural network quantum states;Physical Review B;2024-09-05

2. Empowering deep neural quantum states through efficient optimization;Nature Physics;2024-07-01