Investigation of Spoken-Language Detection and Classification in Broadcasted Audio Content

Published:2020-04-15 Issue:4 Volume:11 Page:211
ISSN:2078-2489
Container-title:Information
language:en
Short-container-title:Information

Author:

Kotsakis Rigas, Matsiola Maria, Kalliris George, Dimoulas Charalampos

Abstract

The current paper focuses on the investigation of spoken-language classification in audio broadcasting content. The approach reflects a real-world scenario, encountered in modern media/monitoring organizations, where semi-automated indexing/documentation is deployed, which could be facilitated by the proposed language detection preprocessing. Multilingual audio recordings of specific radio streams are formed into a small dataset, which is used for the adaptive classification experiments, without seeking, at this step, a generic language recognition model. Specifically, hierarchical discrimination schemes are followed to separate voice signals before classifying the spoken languages. Supervised and unsupervised machine learning is utilized at various windowing configurations to test the validity of our hypothesis. Besides the analysis of the achieved recognition scores (partial and overall), late integration models are proposed for semi-automatic annotation of new audio recordings. Hence, data augmentation mechanisms are offered, aiming at gradually formulating a Generic Audio Language Classification Repository. This database constitutes a program-adaptive collection that, besides the self-indexing metadata mechanisms, could facilitate generic language classification models in the future, through state-of-the-art techniques like deep learning. This approach matches the investigatory inception of the project, which seeks indicators that could be applied in a second step with a larger dataset and/or an already pre-trained model, with the purpose of delivering overall results.

Publisher

MDPI AG

Subject

Information Systems

Link

https://www.mdpi.com/2078-2489/11/4/211/pdf

Reference 48 articles.

1. Investigation of broadcast-audio semantic analysis scenarios employing radio-programme-adaptive pattern classification

2. Contribution of Stereo Information to Feature-Based Pattern Classification for Audio Semantic Analysis

3. 1D/2D Deep CNNs vs. Temporal Feature Integration for General Audio Classification

4. Investigation of an Encoder-Decoder LSTM model on the enhancement of speech intelligibility in noise for hearing-impaired listeners;Thoidis,2019

Cited by 8 articles.

1. Language Detection Based on Audio for Indian Languages;Automatic Speech Recognition and Translation for Low Resource Languages;2024-03-29

2. A Deep Learning Approach for Identifying and Discriminating Spoken Arabic Among Other Languages;IEEE Access;2023

3. The Challenge of an Interactive Audiovisual-Supported Lesson Plan: Information and Communications Technologies (ICTs) in Adult Education;Education Sciences;2022-11-19

4. Spoken Language Identification System Using Convolutional Recurrent Neural Network;Applied Sciences;2022-09-13

5. Extending Radio Broadcasting Semantics through Adaptive Audio Segmentation Automations;Knowledge;2022-07-18