1. Pipelined Backpropagation at Scale: Training Large Models without Batches;kosson;MLSys,2021
2. Gap-aware Mitigation of Gradient Staleness;barkai;ICLR,2020
3. Taming Momentum in a Distributed Asynchronous Environment;hakimi,2019
4. Asynchronous Accelerated Stochastic Gradient Descent;meng;IJCAI,2016