Abstract
We introduce a streaming framework for analyzing stochastic approximation and optimization problems. This streaming framework is analogous to solving optimization problems using time-varying mini-batches that arrive sequentially. We provide non-asymptotic convergence rates for various gradient-based algorithms, including the well-known Stochastic Gradient (SG) descent (a.k.a. the Robbins-Monro algorithm), mini-batch SG, and time-varying mini-batch SG algorithms, as well as their iterated averages (a.k.a. Polyak-Ruppert averaging). We show (i) how to accelerate convergence by choosing the learning rate according to the time-varying mini-batches, (ii) that Polyak-Ruppert averaging achieves optimal convergence in the sense of attaining the Cramér-Rao lower bound, and (iii) how time-varying mini-batches together with Polyak-Ruppert averaging can provide variance reduction and accelerate convergence simultaneously, which is advantageous for many learning problems, such as online, sequential, and large-scale learning. We further demonstrate these favorable effects for various time-varying mini-batches.
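To make the setting concrete, the following is a minimal sketch (not the paper's code) of time-varying mini-batch SG with Polyak-Ruppert averaging on a synthetic least-squares problem; the batch-size schedule n_t = ceil(C·t^rho), the learning-rate schedule gamma_t = c/t^alpha, and all constants are illustrative assumptions rather than the paper's prescribed choices.

```python
# Minimal sketch: time-varying mini-batch SGD with Polyak-Ruppert averaging
# on a synthetic streaming least-squares problem. Schedules and constants
# below are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
d = 5
theta_star = rng.normal(size=d)          # unknown parameter to estimate

def sample_batch(n):
    """Draw n streaming observations (x_i, y_i) with y_i = x_i^T theta* + noise."""
    X = rng.normal(size=(n, d))
    y = X @ theta_star + rng.normal(scale=0.5, size=n)
    return X, y

theta = np.zeros(d)                      # SG iterate
theta_bar = np.zeros(d)                  # Polyak-Ruppert average of the iterates
C, rho = 4, 0.5                          # assumed time-varying batch-size schedule
c, alpha = 0.5, 0.66                     # assumed decaying learning-rate schedule

for t in range(1, 201):
    n_t = int(np.ceil(C * t**rho))       # mini-batch size grows with t
    gamma_t = c / t**alpha               # learning rate decays with t
    X, y = sample_batch(n_t)
    grad = X.T @ (X @ theta - y) / n_t   # mini-batch gradient of the squared loss
    theta -= gamma_t * grad              # stochastic gradient step
    theta_bar += (theta - theta_bar) / t # running uniform average (Polyak-Ruppert)

print("Last-iterate error:", np.linalg.norm(theta - theta_star))
print("Averaged error:    ", np.linalg.norm(theta_bar - theta_star))
```

On runs of this toy problem, the averaged iterate typically ends up closer to theta_star than the last iterate, illustrating the variance-reduction effect of averaging that the abstract refers to.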
Subject
Statistics and Probability
Cited by
1 article.