Incorporating automatic speech recognition methods into the transcription of police-suspect interviews: factors affecting automatic performance

Published: 2023-07-13 Volume: 8
ISSN: 2297-900X
Container-title: Frontiers in Communication
Short-container-title: Front. Commun.

Author:

Harrington Lauren

Abstract

Introduction: In England and Wales, transcripts of police-suspect interviews are often admitted as evidence in courts of law. Orthographic transcription is a time-consuming process and is usually carried out by untrained transcribers, resulting in records that contain summaries of large sections of the interview and paraphrased speech. The omission or inaccurate representation of important speech content could have serious consequences in a court of law. It is therefore clear that investigation into better solutions for police-interview transcription is required. This paper explores the possibility of incorporating automatic speech recognition (ASR) methods into the transcription process, with the goal of producing verbatim transcripts without sacrificing police time and money. We consider the potential viability of automatic transcripts as a “first” draft that would be manually corrected by police transcribers. The study additionally investigates the effects of audio quality, regional accent, and the ASR system used, as well as the types and magnitude of errors produced and their implications in the context of police-suspect interview transcripts.

Methods: Speech data was extracted from two forensically-relevant corpora, with speakers of two accents of British English: Standard Southern British English and West Yorkshire English (a non-standard regional variety). Both a high quality and degraded version of each file was transcribed using three commercially available ASR systems: Amazon, Google, and Rev.

Results: System performance varied depending on the ASR system and the audio quality, and while regional accent was not found to significantly predict word error rate, the distribution of errors varied substantially across the accents, with more potentially damaging errors produced for speakers of West Yorkshire English.

Discussion: The low word error rates and easily identifiable errors produced by Amazon suggest that the incorporation of ASR into the transcription of police-suspect interviews could be viable, though more work is required to investigate the effects of other contextual factors, such as multiple speakers and different types of background noise.

Funder

White Rose College of the Arts and Humanities

Publisher

Frontiers Media SA

Subject

Social Sciences (miscellaneous),Communication

Reference: 67 articles.

1. Automatic transcription system for meetings of the Japanese National Congress; Akita, 2009

2. Fitting linear mixed-effects models using lme4; Bates; J. Stat. Softw., 2015

3. Praat: Doing Phonetics by Computer; Boersma P., Weenink D., 2022

4. Automated generation of 'good enough' transcripts as a first step to transcription of audio-recorded data; Bokhove; Methodol. Innov., 2018

5. The psychological functions of function words; Chung; Soc. Commun., 2007

Cited by 1 article.

1. Automatic speech recognition and the transcription of indistinct forensic audio: how do the new generation of systems fare?;Frontiers in Communication;2024-02-14