Affiliation:
1. SASTRA Deemed University: Shanmugha Arts Science Technology and Research Academy
Abstract
Recognizing linguistic information from speech makes it possible to identify the language in which an utterance is spoken, and such a system could be used as a translator to convert a sentence spoken in one language into another language meaningfully. A real-time implementation of language identification (LID) from speech requires that the speech be sent from a Raspberry Pi board in the transmitter section to a Raspberry Pi board in the receiver section, which passes it to the system that identifies the language of the speech. In the training phase, two-dimensional spectrogram features are derived from the training set of speech utterances and given to a layered CNN architecture to create templates for the languages. In the testing phase, speech is transmitted from the memory card of the Raspberry Pi board in the transmitter; the Raspberry Pi board in the receiver receives it and passes it to the identification system. Two-dimensional spectrogram features are derived from the test speech and applied to the CNN templates, and the test language is identified based on the similarity index. The system is implemented using the spectrogram, Mel spectrogram and ERB spectrogram as features and a CNN for modeling and classification of languages. The validation error is 1.4%, 1.8% and 3% for the spectrogram, Mel spectrogram and ERB spectrogram based systems respectively, and a decision-level fusion classifier gives a validation error of 0.9%. The system can be implemented in hardware using Raspberry Pi boards. This automated real-time multilingual language identification system would be useful in forensic departments and defense sectors to identify persons belonging to any region or speaking any language.
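The two signal-processing steps named in the abstract, deriving a two-dimensional spectrogram and fusing the decisions of several classifiers, can be illustrated with a minimal sketch. The abstract does not give the frame length, hop size, window, or fusion rule, so everything here is an assumption: the function names `spectrogram` and `fuse_decisions` are hypothetical, the Hann window and 256/128 framing are arbitrary choices, and score averaging is only one common form of decision-level fusion, not necessarily the one the authors used.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a short-time Fourier transform.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Small constant avoids log(0); result has shape (frequency bins, frames).
    return np.log(magnitude + 1e-10).T

def fuse_decisions(score_sets):
    """Assumed fusion rule: average per-language scores, pick the highest."""
    mean_scores = np.mean(np.stack(score_sets), axis=0)
    return int(np.argmax(mean_scores)), mean_scores

# Hypothetical per-language scores from three feature-specific classifiers
# (spectrogram, Mel spectrogram, ERB spectrogram); the values are made up.
scores = [np.array([0.6, 0.3, 0.1]),
          np.array([0.4, 0.5, 0.1]),
          np.array([0.5, 0.2, 0.3])]
language_index, fused = fuse_decisions(scores)
```

In this sketch the second classifier alone would pick language 1, but the averaged scores favour language 0, which is the kind of correction a fusion stage is meant to provide.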
Publisher
Research Square Platform LLC