Abstract
Recently, authors have introduced the idea of training neural networks with discrete weights using a mix of classical simulated annealing and a replica ansatz known from the statistical physics literature. Among other points, they claim their method is able to find robust configurations. In this paper, we analyze this so-called "replicated simulated annealing" algorithm. In particular, we give criteria to guarantee its convergence, and study when it successfully samples from robust configurations. We also perform experiments using synthetic and real databases.
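The replicated simulated annealing scheme under analysis (following Baldassi et al. 2016) couples several copies of a discrete-weight configuration and anneals them jointly, so that the replicas attract one another and the search is biased toward wide, robust minima. The sketch below is illustrative only, not the authors' implementation: the toy energy, the parameter values, and the exponential cooling schedule are all assumptions made for the example.

```python
import math
import random

def replicated_sa(energy, n, replicas=3, gamma=0.2,
                  beta0=0.1, beta_rate=1.01, steps=2000, seed=0):
    """Schematic replicated simulated annealing over {-1, +1}^n.

    Each replica performs Metropolis single-spin flips on a joint energy
    of the form  sum_a E(w_a) - gamma * sum_{a<b} <w_a, w_b>,
    so replicas attract one another (gamma > 0 is the coupling strength).
    """
    rng = random.Random(seed)
    ws = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(replicas)]
    beta = beta0  # inverse temperature
    for _ in range(steps):
        for a in range(replicas):
            i = rng.randrange(n)
            old = energy(ws[a])
            # Overlap of spin i with the other replicas before the flip;
            # flipping ws[a][i] negates every one of these products.
            coupling_old = sum(ws[a][i] * ws[b][i]
                               for b in range(replicas) if b != a)
            ws[a][i] = -ws[a][i]          # propose: flip one spin
            d_e = (energy(ws[a]) - old) - gamma * (-2 * coupling_old)
            # Metropolis rule: accept with probability min(1, exp(-beta*d_e)).
            if d_e > 0 and rng.random() >= math.exp(-beta * d_e):
                ws[a][i] = -ws[a][i]      # reject: undo the flip
        beta *= beta_rate                 # exponential cooling schedule
    return min(ws, key=energy)

# Toy usage: recover a planted target vector (illustrative only).
target = [1, -1] * 5
hamming = lambda w: sum(wi != ti for wi, ti in zip(w, target))
best = replicated_sa(hamming, n=10)
```

Note that the coupling strength `gamma` must be kept small relative to the single-flip energy scale (here `2 * gamma * (replicas - 1) < 1`), otherwise replicas that all agree on a wrong spin form a local trap for single-spin moves.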
Funder
Deutsche Forschungsgemeinschaft
Publisher
Springer Science and Business Media LLC
Subject
Mathematical Physics, Statistical and Nonlinear Physics
Cited by 2 articles.