An Overview of Stochastic Quasi-Newton Methods for Large-Scale Machine Learning

Published: 2023-02-25 Issue: 2 Volume: 11 Page: 245-275
ISSN: 2194-668X
Container-title: Journal of the Operations Research Society of China
Language: en
Short-container-title: J. Oper. Res. Soc. China

Author:

Guo Tian-De, Liu Yan, Han Cong-Ying

Abstract

Numerous intriguing optimization problems arise as a result of the advancement of machine learning. The stochastic first-order method is the predominant choice for those problems due to its high efficiency. However, the negative effects of noisy gradient estimates and high nonlinearity of the loss function result in a slow convergence rate. Second-order algorithms have their typical advantages in dealing with highly nonlinear and ill-conditioning problems. This paper provides a review on recent developments in stochastic variants of quasi-Newton methods, which construct the Hessian approximations using only gradient information. We concentrate on BFGS-based methods in stochastic settings and highlight the algorithmic improvements that enable the algorithm to work in various scenarios. Future research on stochastic quasi-Newton methods should focus on enhancing its applicability, lowering the computational and storage costs, and improving the convergence rate.

Funder

National Key R&D Program of China

National Natural Science Foundation of China

Natural Science Foundation of Tianjin

Publisher

Springer Science and Business Media LLC

Subject

Management Science and Operations Research

Link

https://link.springer.com/content/pdf/10.1007/s40305-023-00453-9.pdf

References: 149 articles.

1. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)

2. Le, Q., Ngiam, J., Coates, A., Lahiri, A., Prochnow, B., Ng, A.: On optimization methods for deep learning. In: International Conference on Machine Learning. pp. 265–272. ACM, New York, USA (2011)

3. Bottou, L.: Stochastic gradient learning in neural networks. In: Proceedings of Neuro-Nîmes 91 (1991)

4. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

5. Bottou, L., Bousquet, O.: The tradeoffs of large scale learning. In: Neural Information Processing Systems. vol. 20. Curran Associates, Inc. (2007). https://proceedings.neurips.cc/paper/2007/file/0d3180d672e08b4c5312dcdafdf6ef36-Paper.pdf

Cited by 3 articles.

1. Intelligent Procurement Scheduling System for Items Involving Public Procurement;Applied System Innovation;2024-09-05

2. Controllable and scalable gradient-driven optimization design for two-dimensional metamaterials based on deep learning;Composite Structures;2024-06

3. Exploring Physics‐Informed Neural Networks for the Generalized Nonlinear Sine‐Gordon Equation;Applied Computational Intelligence and Soft Computing;2024-01