1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst. 30.
2. Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv, arXiv:1810.04805.
3. Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. Preprint. Available online: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (accessed on 28 February 2024).
4. Zerveas, G., Jayaraman, S., Patel, D., Bhamidipaty, A., and Eickhoff, C. (2021, August 14–18). A transformer-based framework for multivariate time series representation learning. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual.
5. Ismail Fawaz, H., Lucas, B., Forestier, G., Pelletier, C., Schmidt, D.F., Weber, J., Webb, G.I., Idoumghar, L., Muller, P.-A., and Petitjean, F. (2020). InceptionTime: Finding AlexNet for time series classification. Data Min. Knowl. Discov. 34, 1936–1962.