Affiliation:
1. School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
Abstract
Background:
Crying is the universal means by which infants communicate with others. Infant cry classification
is a speech recognition problem that requires careful treatment. It has gained momentum in the last
few years, and a reliable classifier would be very helpful for caretakers.
Objective:
This study aims to develop a predictive model for infant cry classification by converting audio signals
into spectrogram images and then applying a deep convolutional neural network. The model learns end to end,
which reduces the complexity involved in audio signal analysis, and an optimization technique is used to
improve its performance.
Method:
A time-frequency analysis, the Short-Time Fourier Transform (STFT), is applied to generate the
spectrogram, using 256 DFT (Discrete Fourier Transform) points to compute the Fourier transform. A deep
convolutional neural network, AlexNet with a few enhancements, is used to classify the recorded infant
cries. To improve the effectiveness of this network, it is trained using Stochastic Gradient Descent with
Momentum (SGDM).
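As a rough illustration of the two building blocks named above, the following sketch computes a log-magnitude spectrogram with a 256-point STFT and performs a single SGDM parameter update. Only the 256 DFT points and the use of SGDM come from the abstract. The Hann window, hop size, sampling rate, learning rate, momentum value, and the function names stft_spectrogram and sgdm_step are assumptions chosen for the example, not the authors' settings.

```python
import numpy as np

def stft_spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed STFT.

    n_fft=256 matches the abstract; the Hann window and hop=128
    are assumptions made for this sketch.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    # 256-point DFT of each frame; rfft keeps the 129 non-negative bins.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    # Convert magnitude to decibels; rows are frequency bins, columns are frames.
    return 20.0 * np.log10(np.abs(spectrum) + 1e-10).T

def sgdm_step(weights, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update (lr and momentum are assumed values)."""
    velocity = momentum * velocity - lr * grad
    return weights + velocity, velocity

# Synthetic 1 kHz tone sampled at 8 kHz stands in for a recorded cry (assumption).
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spec = stft_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))

# Single SGDM update on a toy parameter vector.
w, v = sgdm_step(np.array([1.0]), np.array([2.0]), np.array([0.0]))
```

In a full pipeline the spectrogram array would be rendered or resized to the image dimensions expected by the network before training. That step depends on details the abstract does not give, so it is left out here.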
Results:
The deep neural network-based infant cry classification system achieves a maximum accuracy of 95% in the
classification of sleepy cries. The results show that the convolutional neural network with SGDM optimization
achieves higher prediction accuracy than the methods it was compared against.
Conclusion:
The proposed approach was compared with a convolutional neural network trained with SGD and with Naïve Bayes.
The results indicate that the convolutional neural network with SGDM performs better than these other techniques.
Publisher
Bentham Science Publishers Ltd.
Cited by
7 articles.