1. Linguistic Knowledge and Transferability of Contextual Representations
2. Attention Is All You Need;vaswani;Advances in Neural Information Processing Systems,2017
3. On Identifiability in Transformers;brunner;International Conference on Learning Representations,2020
4. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer;raffel;Journal of Machine Learning Research,2020