Abstract
The accuracy with which naive listeners can report sentences presented 12 dB below a background of continuous prose was compared with accuracy in four audio-visually supplemented conditions. With monochrome displays of the talker showing (i) the face, (ii) the lips and (iii) four points at the centres of the lips and the corners of the mouth, accuracy improved by 43, 31 and 8%, respectively. No improvement was produced by optical information on syllabic timing. The results suggest that optical concomitants of articulation specify linguistic information to normal listeners. This conclusion was reinforced in a second experiment in which identification functions were obtained for continua of synthetic syllables ranging between [aba], [ada] and [aga], presented both in isolation and in combination with video recordings. Audio-visually, [b] was perceived only when lip closure was specified optically and, when lip closure was specified optically, [b] was generally perceived. Perceivers appear to make use of articulatory constraints upon the combined audio-visual specification of phonetic events, suggesting that optical and acoustical displays are co-perceived in a common metric closely related to that of articulatory dynamics.
Subject
Linguistics and Language, Acoustics and Ultrasonics, Language and Linguistics
Cited by
247 articles.