1. R. Mayer, H.-A. Jacobsen, Scalable deep learning on distributed infrastructures: Challenges, techniques, and tools, ACM Computing Surveys (CSUR) (2020).
2. D. Amodei, et al., Deep speech 2: End-to-end speech recognition in English and Mandarin (2016).
3. G. Huang, et al., Densely connected convolutional networks (2017).
4. J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
5. M.E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer, Deep contextualized word representations, arXiv preprint arXiv:1802.05365 (2018).