1. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020). Language Models are Few-Shot Learners. Adv. Neural Inf. Process. Syst.
2. Smith, S., Patwary, M., Norick, B., LeGresley, P., Rajbhandari, S., Casper, J., Liu, Z., Prabhumoye, S., Zerveas, G., and Korthikanti, V. (2022). Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model. arXiv.
3. He, K., Zhang, X., Ren, S., and Sun, J. (2016, June 27–30). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.
4. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale. arXiv.
5. Muhammad, K., Ullah, A., Lloret, J., Del Ser, J., and de Albuquerque, V.H.C. (2021). Deep Learning for Safe Autonomous Driving: Current Challenges and Future Directions. IEEE Trans. Intell. Transp. Syst.