Automatic transcription system for parliamentary debates in the context of assembly of the republic of Portugal

Published: 2024-07-17 Issue: 3 Volume: 27 Page: 613-635
ISSN: 1381-2416
Container-title: International Journal of Speech Technology
Language: en
Short-container-title: Int J Speech Technol

Author:

Nascimento Pedro, Ferreira João C., Batista Fernando

Abstract

The transcription of parliamentary proceedings is essential for democratic governance. Traditional methods are manual and time-consuming. This work introduces an Automatic Transcription System for the Assembly of the Republic of Portugal (STAAR) that uses an automatic speech recognition model and speaker diarization technologies. STAAR was developed after analyzing existing technologies and the Assembly’s specific needs, leading to an effective solution that integrates with current processes. STAAR stands out for its efficiency in transcribing debates and adapting to parliamentary language nuances. It significantly exceeded expectations by presenting a low transcription error rate, ranging from 1.7 to 11.3%, depending on the context and speech style, reducing the time required to produce the official parliamentary debates journal, and improving overall transcription efficiency. Additionally, STAAR enabled the transcription of previously undocumented parliamentary committee meetings, enhancing the documentation of parliamentary activities. This achievement marks a significant step in modernizing parliamentary processes, increasing transparency and accessibility of political information, and positions the Portuguese Parliament at the forefront of technological innovation in parliamentary debates transcription.

Funder

ISCTE – Instituto Universitário

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s10772-024-10126-4.pdf

References (38 articles)

1. ISO/IEC 15504-2. (2003). Retrieved 23 Oct 2023, from https://www.iso.org/standard/37458.html

2. Alumäe, T., Tilk, O., & Ullah, A. (2018). Advanced rich transcription system for Estonian speech. Frontiers in Artificial Intelligence and Applications, 307, 8.

3. Baevski, A., Zhou, H., Mohamed, A., & Auli, M. (2020). Wav2vec 2.0: A framework for self-supervised learning of speech representations. In 34th Conference on neural information processing systems (NeurIPS 2020), (Vol. 2020), Vancouver, Canada.

4. Bain, M., Huh, J., Han, T., & Zisserman, A. (2023). WhisperX: Time-accurate speech transcription of long-form audio. https://doi.org/10.48550/arXiv.2303.00747

5. Bredin, H., Yin, R., Coria, J. M., Gelly, G., Korshunov, P., Lavechin, M., Fustes, D., Titeux, H., Bouaziz, W., & Gill, M.-P. (2019). Pyannote.Audio: Neural building blocks for speaker diarization.