The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines and Results

Published: 2022-05-23
Container-title: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Author:

Chen Hang¹, Zhou Hengshun¹, Du Jun¹, Lee Chin-Hui², Chen Jingdong³, Watanabe Shinji⁴, Siniscalchi Sabato Marco², Scharenborg Odette⁵, Liu Di-Yuan⁶, Yin Bao-Cai⁶, Pan Jia⁶, Gao Jian-Qing⁶, Liu Cong⁶

Affiliation:

1. University of Science and Technology of China, China

2. Georgia Institute of Technology, USA

3. Northwestern Polytechnical University, China

4. Carnegie Mellon University, USA

5. Delft University of Technology, The Netherlands

6. iFlytek, China

Funder

National Natural Science Foundation of China

Publisher

IEEE

Link

http://xplorestaging.ieee.org/ielx7/9745891/9746004/09746683.pdf?arnumber=9746683

Reference: 35 articles.

1. SRILM - an extensible language modeling toolkit; Stolcke; Proc ICSLP 2002, 2004

2. The Kaldi speech recognition toolkit; Povey; IEEE Signal Processing Society, 2011

3. Acoustic Beamforming for Speaker Diarization of Meetings

4. NARA-WPE: A Python package for weighted prediction error dereverberation in Numpy and Tensorflow for online and offline processing; Drude; Speech Communication 13th ITG-Symposium, 2018

5. Audio augmentation for speech recognition

Cited by 33 articles.

1. Summary on the Chat-Scenario Chinese Lipreading (ChatCLR) Challenge; 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 2024-07-15

2. The WHU Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge; 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 2024-07-15

3. Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder; 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 2024-07-15

4. Enhancing Visual Wake Word Spotting with Pretrained Model and Feature Balance Scaling; 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 2024-07-15

5. A Review of Key Technologies for Emotion Analysis Using Multimodal Information; Cognitive Computation; 2024-06-01