1. Attention is all you need;vaswani;Proc Adv Neural Inf Process Syst,2017
2. BERT: Pre-training of deep bidirectional transformers for language understanding;devlin;Proc Conf North Amer Chapter Assoc Comput Linguistics Hum Lang Technol,2019
3. BERT, ELMo, USE and InferSent sentence encoders: The panacea for research-paper recommendation?;hassan;Proc 13th ACM Conf Recommender Syst,2019
4. Efficient estimation of word representations in vector space;mikolov;arXiv:1301.3781 [cs],2013
5. Adam: A method for stochastic optimization;kingma;arXiv:1412.6980,2014