Affiliation:
1. Indian Institute of Technology Guwahati, Guwahati-781039, India
2. Indian Institute of Technology Dharwad, Dharwad-580011, India
Abstract
This paper reports the findings of an automatic dialect identification (DID) task conducted on Ao speech data using source features. Because Ao is a tone language, this study proposes the gammatonegram of the linear prediction residual as a feature for DID. As Ao is an under-resourced language, data augmentation was carried out to increase the size of the speech corpus. The results showed that data augmentation improved DID by 14%. A perception test conducted with Ao speakers showed that subjects identified dialects better when the utterance duration was 3 s. Accordingly, automatic DID was conducted on utterances of various durations. A baseline DID system with the Slms feature attained an average F1-score of 53.84% on 3 s long utterances. Inclusion of the source features, Silpr and [Formula: see text], improved the F1-score to 60.69%. In a final system, with a combination of Silpr, [Formula: see text], Slms, and Mel frequency cepstral coefficient features, the F1-score increased to 61.46%.
Publisher
Acoustical Society of America (ASA)
Subject
Acoustics and Ultrasonics, Arts and Humanities (miscellaneous)
Cited by
3 articles.