1. Neural speaker diarization with pyannote.audio. https://github.com/pyannote/pyannote-audio
2. Speech recognition on Common Voice 8.0 Spanish. https://paperswithcode.com/sota/speech-recognition-on-common-voice-8-0-16
3. Speech to text: A speech service feature that accurately transcribes spoken audio to text; Azure Cognitive Services. https://azure.microsoft.com/es-es/services/cognitive-services/speech-to-text/#features
4. Speech-to-Text: Automatic speech recognition; Google Cloud. https://cloud.google.com/speech/
5. Hugging Face Trainer (2021). https://huggingface.co/docs/transformers/main_classes/trainer