1. Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265–283.
2. MPI on a Million Processors
3. SALaR: Scalable and Adaptive Designs for Large Message Reduction Collectives
4. A survey of MPI usage in the US exascale computing project
5. Accelerating distributed deep neural network training with pipelined MPI allreduce