1. S. Hochreiter, J. Schmidhuber, Long Short-Term Memory, Neural Computation 9(8) (1997) 1735–1780. doi: 10.1162/neco.1997.9.8.1735.
2. J.S. Cramer, Econometric Applications of Maximum Likelihood Methods, Cambridge University Press, 1986. doi: 10.1017/CBO9780511572050.
3. L. Yu, W. Zhang, J. Wang, Y. Yu, SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient, in: AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017, pp. 2852–2858. arXiv:1609.05473.
4. X. Chen, H. Fang, T.-Y. Lin, R. Vedantam, S. Gupta, P. Dollár, C.L. Zitnick, Microsoft COCO Captions: Data Collection and Evaluation Server, 2015. arXiv:1504.00325.
5. J. Guo, S. Lu, H. Cai, W. Zhang, Y. Yu, J. Wang, Long Text Generation via Adversarial Training with Leaked Information, in: The Thirty-Second AAAI Conference on Artificial Intelligence, vol. 32, no. 1, 2018, pp. 5141–5148. arXiv:1709.08624.