1. Adam: A method for stochastic optimization;Kingma;arXiv preprint,2014
2. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin;arXiv preprint,2018
3. Distributed representations of words and phrases and their compositionality;Mikolov;Advances in Neural Information Processing Systems,2013
4. Did It Happen? The Pragmatic Complexity of Veridicality Assessment