1. Guillaume Alain and Yoshua Bengio. 2016. Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644 (2016).
2. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).
3. Michał Daniluk, Tim Rocktäschel, Johannes Welbl, and Sebastian Riedel. 2017. Frustratingly short attention spans in neural language modeling. arXiv preprint arXiv:1702.04521 (2017).