1. 1. Ardila, R., Branson, M., Davis, K., Henretty, M., Kohler, M., Meyer, J., Morais, R., Saunders, L., Tyers, F.M., Weber, G.: Common voice: A massively-multilingual speech corpus. In: ..12th Int. Conf. on Language Resources and Evaluation, .pp. 4218-4222. (2020)
2. 2. Baevski, A., Zhou, H., Mohamed, A., Auli, M.: wav2vec 2.0: A framework for self-supervised learning of speech representations. Adv. Neural Inf. Process. Syst., pp.1-19 (2020)
3. 3. Dhawan, K., Rekesh, Kd., Ginsburg, B.: Unified Model for Code-Switching Speech Recognition and Language Identification Based on Concatenated Tokenizer. In: Winata, G., Kar, S., Zhukova, M., Solorio, T., Diab, M., Sitaram, S., Choudhury, M., and Bali, K. (eds.) 6th Workshop on Computational Approaches to Linguistic Code-Switching pp. 74-82 (2023)
4. 4. Graham, C., Roll, N.: Evaluating OpenAI's Whisper ASR: Performance analysis across diverse accents and speaker traits. JASA Express Lett. 4 (2), (2024)
5. 5. Kuligowska, K., Stanusch, M., Koniew, M.: Challenges of Automatic Speech Recognition for medical interviews - research for Polish language. Procedia Comput. Sci. 225, pp. 1134- 1141. (2023)