Decentralized and parallel primal and dual accelerated methods for stochastic convex programming problems

Authors:

Dvinskikh Darina¹, Gasnikov Alexander²

Affiliations:

1. Weierstrass Institute for Applied Analysis and Stochastics, Mohrenstr. 39, 10117 Berlin, Germany; and Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia; and Institute for Information Transmission Problems RAS, Moscow, Russia

2. Department of Control and Applied Mathematics, Moscow Institute of Physics and Technology, 9 Institutskiy pereulok, Dolgoprudny, Moscow Region, 141701, Russia; and Institute for Information Transmission Problems RAS, Moscow, Russia; and Weierstrass Institute for Applied Analysis and Stochastics, Berlin, Germany

Abstract

We introduce primal and dual stochastic gradient oracle methods for decentralized convex optimization problems. For both the primal and the dual oracle, the proposed methods are optimal in terms of the number of communication steps. However, for all classes of objectives, optimality in terms of the number of oracle calls per node holds only up to a logarithmic factor and up to the notion of smoothness. Using a mini-batching technique, we show that the proposed methods with a stochastic oracle can be additionally parallelized at each node. The considered algorithms can be applied to many data science problems and inverse problems.

Funder

Russian Science Foundation

Russian Foundation for Basic Research

Ministry of Science and Higher Education of the Russian Federation

Publisher

Walter de Gruyter GmbH

Subject

Applied Mathematics

Cited by 17 articles.
