A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images

Published:2021-07-20 Issue:1 Volume:8
ISSN:2052-4463
Container-title:Scientific Data
language:en
Short-container-title:Sci Data

Author:

Lim Yongwan, Toutios Asterios, Bliesener Yannick, Tian Ye, Lingala Sajan Goud, Vaz Colin, Sorensen Tanner, Oh Miran, Harper Sarah, Chen Weiyi, Lee Yoonjeong, Töger Johannes, Monteserin Mairym Lloréns, Smith Caitlin, Godinez Bianca, Goldstein Louis, Byrd Dani, Nayak Krishna S., Narayanan Shrikanth S.

Abstract

Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is however limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to-date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically-relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos along with synchronized audio for 75 participants performing linguistically motivated speech tasks, alongside the corresponding public domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each participant.

Funder

National Science Foundation

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences; Statistics, Probability and Uncertainty; Computer Science Applications; Education; Information Systems; Statistics and Probability

Link

http://www.nature.com/articles/s41597-021-00976-x.pdf

Reference: 87 articles.

1. Lingala, S. G., Sutton, B. P., Miquel, M. E. & Nayak, K. S. Recommendations for real-time speech MRI. J. Magn. Reson. Imaging 43, 28–44 (2016).

2. Scott, A. D., Wylezinska, M., Birch, M. J. & Miquel, M. E. Speech MRI: Morphology and function. Phys. Medica 30, 604–618 (2014).

3. Ramanarayanan, V. et al. Analysis of speech production real-time MRI. Comput. Speech. Lang. 52, 1–22 (2018).

4. Hagedorn, C. et al. Engineering Innovation in Speech Science: Data and Technologies. Perspect. ASHA Spec. Interes. Groups 4, 411–420 (2019).

5. Bresch, E., Kim, Y. C., Nayak, K., Byrd, D. & Narayanan, S. Seeing speech: Capturing vocal tract shaping using real-time magnetic resonance imaging. IEEE Signal Process. Mag. 25, 123–129 (2008).

Cited by 23 articles.

1. NEBULA101: an open dataset for the study of language aptitude in behaviour, brain structure and function;2024-08-28

2. Interindividual vocal tract diversity influences the phonetic diversification of spoken languages;2024-07-26

3. Deep learning for accelerated and robust MRI reconstruction;Magnetic Resonance Materials in Physics, Biology and Medicine;2024-07-23

4. Speech Understanding on Tiny Devices with A Learning Cache;Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services;2024-06-03

5. Bilinguals from Larynx to Lips: Exploring Bilingual Articulatory Strategies with Anatomic MRI Data;Language and Speech;2024-04-28