Affiliation:
1. Department of Computer Engineering, Sirjan University of Technology, Sirjan, Iran
2. Department of Electrical Engineering, Sirjan University of Technology, Sirjan, Iran
Abstract
Background:
The growing demand for automatic intelligent systems has drawn increasing attention to
modern techniques for interaction between humans and machines. These techniques generally
fall into two types: audio and visual methods. Accordingly, developing algorithms that enable
machines to recognize human speech is of high importance and is frequently studied by
researchers.
Objective:
Artificial intelligence methods have led to better results in human speech recognition, but the
basic problem is the lack of an appropriate strategy for selecting the recognition data from the huge amount of
speech information, which makes the available algorithms impractical to apply.
Method:
In this article, to solve this problem, linear predictive coding (LPC) coefficient extraction is
used to summarize the data related to the pronunciation of English digits. The extracted database is then
fed to an Elman neural network, which learns the relation between the linear predictive coding coefficients
of an audio file and the pronounced digit.
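The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the LPC order, frame length, network size, or training procedure, so the values below (order 8, 160-sample frames, 16 hidden units, 10 output classes) and the function names are assumptions chosen only for illustration. The sketch computes LPC coefficients per frame with the autocorrelation method and the Levinson-Durbin recursion, then runs the forward pass of an Elman-style recurrent network with untrained random weights; training is omitted.

```python
import math
import random

def lpc_coefficients(frame, order):
    """LPC coefficients a[1..order] for the error model e[n] = x[n] + sum_k a[k] * x[n-k],
    computed with the autocorrelation method and the Levinson-Durbin recursion."""
    n = len(frame)
    # Autocorrelation for lags 0..order
    r = [sum(frame[t] * frame[t + k] for t in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: leave remaining coefficients at zero
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k  # prediction error shrinks at each order
    return a[1:]

def elman_forward(frames, w_in, w_ctx, w_out):
    """Forward pass of an Elman-style network: the hidden state from the previous
    frame is fed back as context. Returns softmax class probabilities."""
    hidden = len(w_ctx)
    context = [0.0] * hidden
    for x in frames:
        context = [
            math.tanh(sum(w_in[h][i] * x[i] for i in range(len(x)))
                      + sum(w_ctx[h][c] * context[c] for c in range(hidden)))
            for h in range(hidden)
        ]
    logits = [sum(w_out[o][h] * context[h] for h in range(hidden))
              for o in range(len(w_out))]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Small random weights; a real system would learn these from labeled digits."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# Toy usage on a synthetic signal (assumed sizes, not values from the paper)
rng = random.Random(0)
order, hidden, classes, frame_len = 8, 16, 10, 160
signal = [math.sin(0.05 * t) + 0.1 * rng.uniform(-1, 1) for t in range(800)]
frames = [signal[s:s + frame_len]
          for s in range(0, len(signal) - frame_len + 1, frame_len)]
features = [lpc_coefficients(f, order) for f in frames]
probs = elman_forward(features,
                      random_matrix(hidden, order, rng),
                      random_matrix(hidden, hidden, rng),
                      random_matrix(classes, hidden, rng))
print(len(features), "frames ->", len(probs), "class probabilities")
```

Because the weights are random, the output probabilities carry no meaning here; the sketch only shows how a sequence of per-frame LPC vectors could be reduced to one score per digit.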
Results:
The results show that this method performs well compared with other methods. According
to the experiments, the results of network training (99% recognition accuracy) indicate that the
network still performs better than RBF despite many errors.
Conclusion:
The experiments showed that the Elman memory neural network achieved acceptable
performance in recognizing speech signals compared with the other algorithms. Using linear
predictive coding coefficients together with the Elman neural network led to higher recognition accuracy
and improved the speech recognition system.
Publisher
Bentham Science Publishers Ltd.