1. Acceleration of Stochastic Approximation by Averaging
2. Lingvo: A modular and scalable framework for sequence-to-sequence modeling. Shen et al., 2019
3. Speech recognition with deep recurrent neural networks
4. Deliberation networks: Sequence generation beyond one-pass decoding. Xia et al. Advances in Neural Information Processing Systems, 2017
5. Tied & Reduced RNN-T Decoder