Performance Analysis of the AdaGrad Family of Algorithms

Published: 2023 Page: 121-132
ISSN: 2367-3370
Container-title: Lecture Notes in Networks and Systems

Author:

Bagepalli Abhishek, Singh Sanjay

Publisher

Springer Nature Singapore

Link

https://link.springer.com/content/pdf/10.1007/978-981-99-4284-8_10

References (12 articles)

1. Ahn K, Zhang J, Sra S (2022) Understanding the unstable convergence of gradient descent. In: Chaudhuri K, Jegelka S, Song L, Szepesvári C, Niu G, Sabato S (eds) International conference on machine learning, ICML 2022, 17–23 July 2022, Baltimore, Maryland, USA. Proceedings of machine learning research, vol 162. PMLR, pp 247–257. https://proceedings.mlr.press/v162/ahn22a.html

2. Antonakopoulos K, Mertikopoulos P, Piliouras G, Wang X (2022) AdaGrad avoids saddle points. In: Chaudhuri K, Jegelka S, Song L, Szepesvári C, Niu G, Sabato S (eds) Proceedings of the 39th international conference on machine learning. Proceedings of machine learning research, vol 162. PMLR (17–23 Jul 2022), pp 731–771. https://proceedings.mlr.press/v162/antonakopoulos22a.html

3. Cazenave T, Sentuc J, Videau M (2022) Cosine annealing, mixnet and swish activation for computer Go. In: Browne C, Kishimoto A, Schaeffer J (eds) Advances in computer games. Springer International Publishing, Cham, pp 53–60

4. Duchi JC, Hazan E, Singer Y (2011) Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res 12:2121–2159. https://doi.org/10.5555/1953048.2021068

5. Ghojogh B, Ghojogh A, Crowley M, Karray F (2019) Fitting a mixture distribution to data: tutorial. arXiv:1901.06708