1. Chollet F (2021): Deep Learning with Python.
2. Vaswani A, et al. (2017): Attention is all you need. Adv Neural Inf Process Syst.
3. Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. (2020): Language models are few-shot learners. arXiv https://doi.org/10.48550/arXiv.2005.14165
4. Devlin J, et al. (2019): BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019).