1. Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, and Mike Rabbat. 2019. Stochastic gradient push for distributed deep learning. In Proceedings of the International Conference on Machine Learning (ICML). PMLR, 344–353.
2. Medha Atre, Birendra Jha, and Ashwini Rao. 2021. Distributed Deep Learning Using Volunteer Computing-Like Paradigm. arXiv preprint arXiv:2103.08894 (2021).
3. David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, and Antonio Torralba. 2020. Understanding the role of individual units in a deep neural network. Proceedings of the National Academy of Sciences 117, 48 (2020), 30071–30078.
4. Oded Ben-David and Zohar Ringel. 2019. The role of a layer in deep neural networks: a Gaussian Process perspective. arXiv preprint arXiv:1902.02354 (2019).
5. Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020).