1. Mikolov T et al (2013) Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst 1–9
2. Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: 31st Int Conf Mach Learn ICML 2014, vol 4, pp 2931–2939
3. Devlin J et al (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL HLT 2019, vol 1, pp 4171–4186
4. Ramos J (2003) Using TF-IDF to determine word relevance in document queries. Proc First Instr Conf Mach Learn 242(1):29–48
5. Joulin A, Grave E et al (2017) Bag of tricks for efficient text classification. In: 15th Conf Eur Chapter Assoc Comput Linguist EACL 2017, vol 2, pp 427–431