1. wav2vec 2.0: A framework for self-supervised learning of speech representations;Baevski Alexei;Advances in Neural Information Processing Systems,2020
2. Zhihao Bai , Zhen Zhang , Yibo Zhu , and Xin Jin . 2020 . PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI '20) . Zhihao Bai, Zhen Zhang, Yibo Zhu, and Xin Jin. 2020. PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI '20).
3. Mandeep Baines Shruti Bhosale Vittorio Caggiano Naman Goyal Siddharth Goyal Myle Ott Benjamin Lefaudeux Vitaliy Liptchinsky Mike Rabbat Sam Sheiffer Anjali Sridhar and Min Xu. 2021. FairScale: A general purpose modular PyTorch library for high performance and large scale training. https://github.com/facebookresearch/fairscale. Mandeep Baines Shruti Bhosale Vittorio Caggiano Naman Goyal Siddharth Goyal Myle Ott Benjamin Lefaudeux Vitaliy Liptchinsky Mike Rabbat Sam Sheiffer Anjali Sridhar and Min Xu. 2021. FairScale: A general purpose modular PyTorch library for high performance and large scale training. https://github.com/facebookresearch/fairscale.
4. Edmon Begoli , Seung-Hwan Lim , and Sudarshan Srinivasan . 2021 . Performance Profile of Transformer Fine-Tuning in Multi-GPU Cloud Environments. In 2021 IEEE International Conference on Big Data (Big Data). IEEE, 3095--3100 . Edmon Begoli, Seung-Hwan Lim, and Sudarshan Srinivasan. 2021. Performance Profile of Transformer Fine-Tuning in Multi-GPU Cloud Environments. In 2021 IEEE International Conference on Big Data (Big Data). IEEE, 3095--3100.
5. Tom Brown , Benjamin Mann , Nick Ryder , Melanie Subbiah , Jared D Kaplan , Prafulla Dhariwal , Arvind Neelakantan , Pranav Shyam , Girish Sastry , Amanda Askell , Sandhini Agarwal , Ariel Herbert-Voss , Gretchen Krueger , Tom Henighan , Rewon Child , Aditya Ramesh , Daniel Ziegler , Jeffrey Wu , Clemens Winter , Chris Hesse , Mark Chen , Eric Sigler , Mateusz Litwin , Scott Gray , Benjamin Chess , Jack Clark , Christopher Berner , Sam McCandlish , Alec Radford , Ilya Sutskever , and Dario Amodei . 2020 . Language Models are Few-Shot Learners . In Advances in Neural Information Processing Systems (NeurIPS '20) . Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems (NeurIPS '20).