1. Neural machine translation by jointly learning to align and translate;Bahdanau,2015
2. A neural probabilistic language model;Bengio,2000
3. Term discrimination for text search tasks derived from negative binomial distribution;Bernauer;Information Processing and Management,2018
4. Semi-supervised sequence learning;Dai,2015
5. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin,2019