1. S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
2. F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: Continual prediction with LSTM,” Neural Computation, vol. 12, no. 10, pp. 2451-2471, 2000.
3. A. Vaswani et al., “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.
4. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.
5. J. Fernald and J. H. Rogers, “Puzzles in the Chinese stock market,” Review of Economics and Statistics, vol. 84, no. 3, pp. 416-432, 2002.