1. Hierarchical transformers for multi-document summarization;Liu,2019
2. Extractive text summarization using BERT;Patil,2022
3. Maximiliana Behnke, Kenneth Heafield, Losing heads in the lottery: Pruning transformer attention in neural machine translation, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP, 2020.
4. A comparison of transformer and recurrent neural networks on multilingual neural machine translation;Lakew,2018
5. An image is worth 16x16 words: Transformers for image recognition at scale;Dosovitskiy,2020