Author:
Nawasta Revanto Alif, Cahyana Nur Heri, Heriyanto Heriyanto
Abstract
Purpose: To detect emotions from voice intonation by implementing MFCC as the feature-extraction method and KNN as the emotion-classification method.

Design/methodology/approach: The data used in this study were downloaded from several video podcasts on YouTube. The methods applied include pitch shifting for data augmentation, MFCC for feature extraction from the audio data, basic statistics (mean, median, minimum, maximum, and standard deviation) computed for each coefficient, min-max scaling for normalization, and KNN for classification.

Findings/result: Because testing was carried out separately for each gender, two classification models were built. The male model achieved a highest accuracy of 88.8% and is considered a good fit. The female model achieved a highest accuracy of 92.5%, but it was unable to classify emotions in new data correctly, a condition known as overfitting. Further testing showed the cause: pitch-shifting augmentation by a single tone in the female data could not compensate for a training set that was too small and did not contain enough samples to accurately represent all possible input values.

Originality/value/state of the art: The data used in this study have not appeared in previous research, because they were obtained by downloading from YouTube and then processed until ready for use.
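The pipeline described in the abstract (per-coefficient summary statistics over MFCC frames, min-max normalization, then KNN) can be sketched as follows. This is a minimal illustration, not the authors' code: the MFCC matrices here are random placeholders standing in for output of a feature extractor such as librosa, and the number of coefficients, neighbors, and emotion classes are assumed values.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier

def summarize_mfcc(mfcc):
    """Collapse an MFCC matrix of shape (n_coeff, n_frames) into one
    feature vector: mean, median, min, max, and std per coefficient."""
    return np.concatenate([
        mfcc.mean(axis=1),
        np.median(mfcc, axis=1),
        mfcc.min(axis=1),
        mfcc.max(axis=1),
        mfcc.std(axis=1),
    ])

rng = np.random.default_rng(0)
# Placeholder data: 40 clips, 13 MFCC coefficients, 100 frames each.
# In the actual study these would come from the YouTube podcast audio.
X = np.array([summarize_mfcc(rng.normal(size=(13, 100))) for _ in range(40)])
y = rng.integers(0, 4, size=40)  # 4 hypothetical emotion labels

# Min-max normalization of each summary feature to [0, 1].
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# KNN classifier; k=3 is an assumed choice, not taken from the paper.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_scaled, y)
pred = knn.predict(X_scaled[:5])
```

Each clip yields 13 × 5 = 65 features regardless of its duration, which is what makes variable-length audio usable with a fixed-input classifier like KNN.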
Publisher
Universitas Pembangunan Nasional Veteran Yogyakarta
Cited by
1 article.