1. Alvi A, Kharya P (2021) Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, the world’s largest and most powerful generative language model. Available at: https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/ (accessed 23 October 2021).
2. Anderson C (2008) The end of theory: The data deluge makes the scientific method obsolete. Wired Magazine. Available at: https://www.wired.com/2008/06/pb-theory/ (accessed 25 October 2021).
3. Bengio Y (2009) Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1): 1–127.
4. Best J (2013) IBM Watson: The inside story of how the Jeopardy-winning supercomputer was born, and what it wants to do next. Available at: https://www.techrepublic.com/article/ibm-watson-the-inside-story-of-how-the-jeopardy-winning-supercomputer-was-born-and-what-it-wants-to-do-next/ (accessed 15 August 2021).