1. Brunner, G., Liu, Y., Pascual, D., Richter, O., Ciaramita, M., Wattenhofer, R.: On identifiability in transformers (2020)
2. Câmara, A., Hauff, C.: Diagnosing BERT with retrieval heuristics. In: Advances in Information Retrieval (ECIR 2020). Lecture Notes in Computer Science, Springer (2020)
3. Dai, Z., Callan, J.: Deeper text understanding for IR with contextual neural language modeling. CoRR abs/1905.09217 (2019). http://arxiv.org/abs/1905.09217
4. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018). http://arxiv.org/abs/1810.04805
5. Hofstätter, S., Althammer, S., Schröder, M., Sertkan, M., Hanbury, A.: Improving efficient neural ranking models with cross-architecture knowledge distillation (2020)