Acoustic research for telecoms: bridging the heritage to the future

Author:

Nicol Rozenn, Monfort Jean-Yves

Abstract

In its early days, telecommunication focused on voice communication, and acoustics was at the heart of the work on speech coding and transmission, automatic speech recognition, and speech synthesis, with the aim of offering better quality (Quality of Experience, or QoE) and enhanced services to users. As technology has evolved, the research themes have diversified, but acoustics remains essential. This paper gives an overview of the evolution of acoustic research for telecommunication. Communication was initially (and for a long time) audio only, with monophonic narrow-band sound (i.e. [300–3400 Hz]). After the bandwidth extension (from the wide-band [100–7000 Hz] to the full-band [20 Hz–20 kHz] range), the next major step was the introduction of 3D sound, either to provide telepresence in audioconferencing or videoconferencing, or to enhance the QoE of content such as radio, television, VOD, or video games. Loudspeaker or microphone arrays have been deployed to implement “Holophonic” or “Ambisonic” systems. The interaction between spatialized sounds and 3D images was also investigated. At the end of the 2000s, smartphones invaded our lives. Binaural sound was immediately acknowledged as the most suitable technology for reproducing 3D audio on smartphones. However, to achieve a satisfactory QoE, binaural filters need to be customized to the listener’s morphology. This requirement is the main obstacle to mass-market distribution of binaural sound, and addressing it has prompted a large amount of work. In parallel with the development of technologies, their perceptual evaluation was an equally important area of research. In addition to conventional methods, innovative approaches have been explored for the assessment of sound spatialization, such as physiological measurement, neuroscience tools, or Virtual Reality (VR). The latest development is the use of acoustics as a universal sensor for the Internet of Things (IoT) and connected environments. Microphones can be deployed, preferably sparingly, to monitor surrounding sounds, with the goal of detecting information or events using automatic sound recognition models based on neural networks. Applications range from security and personal assistance to acoustic measurement of biodiversity. As for the control of environments or objects, voice commands have become widespread in recent years thanks to the tremendous progress made in speech recognition. An even more intuitive mode, based on direct control by the mind, is offered by Brain Computer Interfaces (BCIs), which rely on sensory stimulation using different modalities, among which the auditory one offers some advantages.

Funder

Orange

Publisher

EDP Sciences

Subject

Electrical and Electronic Engineering, Speech and Hearing, Computer Science Applications, Acoustics and Ultrasonics

