Authors
Axyonov A.A., Ivanko D.V., Lashkov I.B., Ryumin D.A., Kashevnik A.M., Karpov A.A.
Abstract
This paper introduces a new methodology for creating multimodal corpora for audio-visual speech recognition in driver monitoring systems. Multimodal speech recognition makes it possible to rely on audio data when video data are unusable (e.g., at night) and on video data in acoustically noisy conditions (e.g., on highways). The article discusses several basic scenarios in which speech recognition in the vehicle environment is required to interact with the driver monitoring system. The methodology defines the main stages and requirements for the design of a multimodal corpus. The paper also describes the metaparameters that the multimodal corpus must satisfy. In addition, a software package for recording an audio-visual speech corpus is described.
Publisher
Informatization and Communication Journal Editorial Board
Subject
General Agricultural and Biological Sciences