1. Ali, A., Renals, S.: Word error rate estimation for speech recognition: e-WER. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, pp. 20–24 (2018)
2. Akahori, W., Hirai, T., Morishima, S.: Dynamic subtitle placement considering the region of interest and speaker location. In: Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (2017)
3. Apone, T., Botkin, B., Brooks, M., Goldberg, L.: Caption accuracy metrics project: research into automated error ranking of real-time captions in live television news programs. National Center for Accessible Media (NCAM) (2011). http://ncam.wgbh.org/file_download/136
4. Apone, T., Botkin, B., Brooks, M., Goldberg, L.: Caption accuracy metrics project: research into automated error ranking of real-time captions in live television news programs. The Carl and Ruth Shapiro Family National Center for Accessible Media at WGBH (NCAM) (2011)
5. Berke, L., Albusays, K., Seita, M., Huenerfauth, M.: Preferred appearance of captions generated by automatic speech recognition for deaf and hard-of-hearing viewers. In: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA 2019), Paper LBW1713, p. 6. ACM, New York (2019). https://doi.org/10.1145/3290607.3312921